Opticks > 1000x Geant4 (*)
GPU massive parallelism eliminates bottleneck.
[zero: effectively, compared to rest of simulation]
More Photons -> More Benefit
http://bitbucket.org/simoncblyth/opticks
(*) core extrapolated from mobile GPU speed
Status Nov 2016, LLR [1]
CAVEAT: Analytic required for chi2 match
Geometry Status (Nov 2016)
Geometry Status (July 2017)
[1] LLR Workshop, https://simoncblyth.bitbucket.io/env/presentation/opticks_gpu_optical_photon_simulation_nov2016_llr.html
CSG Binary Tree
Primitives combined via binary operators
Simple by construction definition, implicit geometry.
CSG expressions
3D Parametric Ray : ray(t) = r0 + t rDir
Ray Geometry Intersection
How to pick exactly ?
In/On/Out transitions
Classical Roth diagram approach
Computational requirements:
BUT : High performance on GPU requires:
Classical approach not appropriate on GPU
Outside/Inside Unions
dot(normal,rayDir) -> Enter/Exit
Union, tA < tB | Enter B | Exit B | Miss B |
---|---|---|---|
Enter A | ReturnA | LoopA | ReturnA |
Exit A | ReturnA | ReturnB | ReturnA |
Miss A | ReturnB | ReturnB | ReturnMiss |
Union, tB < tA | Enter B | Exit B | Miss B |
---|---|---|---|
Enter A | ReturnB | ReturnB | ReturnA |
Exit A | LoopB | ReturnA | ReturnA |
Miss A | ReturnB | ReturnB | ReturnMiss |
Recursive CSG tree python prototype of Kensler pseudocode worked after state table corrections/extensions
Bit Twiddling Navigation
CSG Tree, leaf node primitives, internal node operators, 4x4 transforms on any node, serialized as complete binary tree:
Height 3 complete binary tree with level order indices:
depth elevation 1 0 3 10 11 1 2 100 101 110 111 2 1 1000 1001 1010 1011 1100 1101 1110 1111 3 0
postorder_next(i,elevation) = i & 1 ? i >> 1 : (i << elevation) + (1 << elevation) ; // from pattern of bits
Postorder tree traverse visits all nodes, starting from leftmost, such that children are visited prior to their parents.
fullTree = PACK( 1 << height, 1 >> 1 ) // leftmost, parent_of_root(=0) tranche.push(fullTree, ray.tmin) while (!tranche.empty) // stack of begin/end indices { begin, end, tmin <- tranche.pop ; node <- begin ; while( node != end ) // over tranche of postorder traversal { elevation = height - TREE_DEPTH(node) ; if(is_primitive(node)){ isect <- intersect_primitive(node, tmin) ; csg.push(isect) } else{ i_left, i_right = csg.pop, csg.pop // csg stack of intersect normals, t l_state = CLASSIFY(i_left, ray.direction, tmin) r_state = CLASSIFY(i_right, ray.direction, tmin) action = LUT(operator(node), leftIsCloser)(l_state, r_state) if( action is ReturnLeft/Right) csg.push(i_left or i_right) else if( action is LoopLeft/Right) { left = 2*node ; right = 2*node + 1 ; endTranche = PACK( node, end ); leftTranche = PACK( left << (elevation-1), right << (elevation-1) ) rightTranche = PACK( right << (elevation-1), node ) loopTranche = action ? leftTranche : rightTranche tranche.push(endTranche, tmin) tranche.push(loopTranche, tminAdvanced ) // subtree re-traversal with changed tmin break ; // to next tranche } } node <- postorder_next(node, elevation) // bit twiddling postorder } } isect = csg.pop(); // winning intersect
https://bitbucket.org/simoncblyth/opticks/src/tip/optixrap/cu/csg_intersect_boolean.h
Closed Solid as: implementation requires otherside intersect, Rigidly attached normals
Type code | Python name | C++ nnode sub-struct |
---|---|---|
CSG_BOX3,CSG_BOX | box3,box | nbox |
CSG_SPHERE,CSG_ZSPHERE | sphere,zsphere | nsphere,nzsphere |
CSG_CYLINDER,CSG_DISC | cylinder,disc | ncylinder,ndisc |
CSG_CONE | cone | ncone |
CSG_CONVEXPOLYHEDRON | convexpolyhedron | nconvexpolyhedron |
CSG_TRAPEZOID,CSG_SEGMENT | trapezoid,segment | nconvexpolyhedron |
Non-primitives, high level CSG definition avoids loadsa code
Three on right (Trapezoid, Segment, Icosahedron) all represented by planes with nconvexpolyhedron
// tboolean-parade
from opticks.ana.base import opticks_main
from opticks.analytic.csg import CSG
args = opticks_main(csgpath="$TMP/$FUNCNAME")
container = CSG("box", param=[0,0,0,1200], boundary=args.container, poly="MC", nx="20" )
a = CSG("sphere", param=[0,0,0,100])
b = CSG("zsphere", param=[0,0,0,100], param1=[-50,60,0,0])
c = CSG("box3",param=[100,50,70,0])
d = CSG("box",param=[0,0,10,50])
e = CSG("cylinder",param=[0,0,0,100], param1=[-100,100,0,0])
f = CSG("disc",param=[0,0,0,100], param1=[-1,1,0,0])
g = CSG("cone", param=[100,-100,50,100])
h = CSG.MakeTrapezoid(z=100, x1=80, y1=100, x2=100, y2=80)
i = CSG.MakeSegment(phi0=0,phi1=45,sz=100,sr=100)
j = CSG.MakeIcosahedron(scale=100.)
prims = [a,b,c,d,e,f,g,h,i,j]
... // setting translations
CSG.Serialize([container] + prims, args.csgpath ) <-- write trees to file
OptiX Geometry
OptiX provides acceleration of geometrical intersection, not the intersection itself.
Intersect finding next closest root:
OptiX/CUDA functions providing:
C++/nnode sub-struct methods
4x4 Transforms on any node (translation/rotation/scaling)
Intersect inverse-transformed ray with un-transformed primitive
Supporting non-uniform scaling requires rayDir not be be normalized (or assumed to be normalized) by primitives.
Positive form CSG Trees
Apply deMorgan pushing negations down tree
End with only UNION, INTERSECT operators, and some complemented leaves.
COMMUTATIVE -> easily rearranged
Intended for solids, not scenes (tree height <8, <256 nodes[*])
Dayabay TopESRCutHols lvidx:57 (height:9 totnodes:1023) di(di(di(di(di(di(di(di(di(cy,cy),cy),cy),cy),cy),cy),cy),cy),cy) di di cy di cy di cy di cy di cy di cy di cy di cy cy cy Balanced Tree, height:4 totnodes:31 in(in(in(in(cy,!cy),in(!cy,!cy)),in(in(!cy,!cy),in(!cy,!cy))),!cy) in in !cy in in in in in in cy !cy !cy !cy !cy !cy !cy !cy
[*] Algorithm has no inherent height limit, but use of complete binary tree imposes practical performance limitation
CSG_DISC implemented to handle disc like cylinders : intersects at middle (z1+z2)/2 and offsets, avoids issue
Cylinder - Cone
Coincident endcaps -> issue
Grow subtracted cone downwards, avoids coincidence : does not change composite solid
Coincidences common (alignment too tempting?). To fix:
SDF definition
SDF(q(x,y,z) distance to solid surface
Defined for all primitives, eg:
Sphere(center,radius) : length(q,center) - radius Plane(normal, dist) : dot(normal,q) - dist
SDFs are composable using R-functions, eg min and max
Recursively applicable to CSG trees
-> analytic representation of arbitrary solid
SDFs enable geometry querying/debugging:
Rvachev-functions (R-functions) : sign uniquely determined by signs of arguments -> parallel CSG operations
Fast (<10s) Iteration
gdml2gltf.py generates for every solid:
"Interactively" modify any solids (249 DYB.Near):
NScanTest : Find Problem Solids by SDF Scanning
Count SDF zero crossings along scanlines (expect even)
~5/249 odd solids : not yet handled by auto-uncoincidence
GScene::compareMeshes : Bounding Box comparisons
G4Poly(via G4DAE) vs Opticks parsurf (parametric surface)
NEXT STEP: direct intersect comparison between Opticks/G4 CSG implementations for definitive answers
(*) could also be G4DAE export of G4Poly
3D File Formats
General CSG ray tracing on GPU allows operation from GDML files
gdml2gltf.py (opticks.analytic.sc)
NScene
NCSG
GScene/OScene
glTF : (GL Transmission Format) "JPEG of 3D" : efficient transmission and loading of 3D scenes and models, https://www.khronos.org/gltf
GPU Raytrace on following pages are for Daya Near Site Geometry, auto translated from GDML
General Translation Machinery
Scene Debugging : NSceneLoadTest
Overview
Opticks enables Geant4 based simulations to benefit from optical photon simulation taking effectively zero time and zero CPU memory, due to massive parallelism made accessible by NVIDIA OptiX. GDML detector geometry is auto translated into a GPU optimized analytic form, fully equivalent to the source geometry.
- The more photons the bigger the overall speedup (99% -> 100x)
- Drastic speedup -> better detector understanding -> greater precision
- Large PMT based neutrino experiments, such as JUNO, can benefit the most
- Optical Photon Simulation Problem
- Ray Traced Image Synthesis ≈ Optical Photon Simulation
- Open Source Opticks
- YouTube Video
For more on Opticks see
https://bitbucket.org/simoncblyth/opticks
[1] LLR Workshop, https://simoncblyth.bitbucket.io/env/presentation/opticks_gpu_optical_photon_simulation_nov2016_llr.html
Overview
Opticks enables Geant4 based simulations to benefit from optical photon simulation taking effectively zero time and zero CPU memory, thanks to massive parallelism made accessible by NVIDIA OptiX.
- The more photons the bigger the overall speedup (99% -> 100x)
- Drastic speedup -> better detector understanding -> greater precision
- Large PMT based neutrino experiments, such as JUNO, can benefit the most
Optical Photon Problem
Optical photons:
Isolated nature -> easily separated propagation
Hybrid Solution Possible : Geant4 + Opticks
OptiX Pixel Calculation
Open Source Opticks
Documentation, install instructions. Repository.
Geometry/event data use NumPy serialization:
import numpy as np a = np.load("photons.npy")
(*) Windows VS2015, non-CUDA only so far
Switching subtraction into union with complemented -> can see whats subtracted.
Polygonization that just works ?
Not a solved problem, problems with some geometry
Several Integrations of Open Source Polygonizations
My own attempt (WIP)