Opticks replaces Geant4 optical photon simulation with an equivalent implementation that benefits from state-of-the-art GPU ray tracing from NVIDIA OptiX, which can yield performance >1000x Geant4.
All optically relevant aspects of Geant4 context are translated+copied to GPU:
- geometry : solids, structure, material+surface properties
- generation : Cerenkov + Scintillation (using Gensteps from Geant4)
- propagation : Rayleigh scattering, absorption, reemission, boundary
COMPLETED : Full Simulation re-implementation for OptiX 7 API
- NEXT : A-B Validation iteration, geometry issue fixes
Integrate NVIDIA OptiX with Geant4
SEvt NumPy arrays : Extreme Detail
: a.genstep : (1, 6, 4) : generation param : a.seed : (10000,) : photon -> genstep : a.inphoton: (10000, 4, 4) : input photon : a.photon : (10000, 4, 4) : final photon param : a.hit : (9633, 4, 4) : hitmasked photon : a.seq : (10000, 2) : history sequence : a.record :(10000, 10, 4, 4) : steppoint record : a.rec :(10000, 10, 2, 4) : compressed record : a.domain : (2, 4, 4) : used for compression : a.sframe : (4, 4, 4) : used for targetting : a.flat : (10000, 64) : random numbers tagged : a.tag : (10000, 4) : compressed tag enum : a.prd :(10000, 10, 2, 4) : per-ray-data isect
Every random consumed by every photon step
A : Opticks : GPU geometry + optical sim.
B : Geant4 simulation
A-B : comparison via python/NumPy analysis
March : Shifted focus, geom issues => optical sim, July : A-B iteration starting
A : Opticks GPU sim, B : Geant4 CPU sim (both fully instrumented: point recording, histories, tagged random consumption)
| pkg | hh/cc/cu/py | Old CUDA Simulation and Visualization |
|---|---|---|
| cudarap | 16/8/3/0 | old CUDA interface |
| thrustrap | 20/2/6/1 | old CUDA thrust interface : photon seeding, indexing |
| optixrap | 43/30/3/2 | old optical photon simulation implemented in old OptiX API |
| okop | 14/10/1/0 | high level pkg on top of optixrap |
| oglrap | 38/29/0/0 | old OpenGL based visualization of opticksgeo geometry |
| opticksgl | 9/5/0/0 | integrating OpenGL with okop OptiX |
| pkg | hh/cc/cu/py | New Geometry model and CUDA simulation |
|---|---|---|
| CSG_GGeo | 3/2/0/0 | GGeo to CSG geometry translation |
| GeoChain | 3/2/0/0 | geometry translation testing |
| CSG | 37/18/0/7 | New geometry model |
| CSGOptiX | 20/14/4/2 | CSG intersection with OptiX 7 |
| qudarap | 43/25/13/1 | CUDA optical photon simulation, CUDA upload, download, textures |
| u4 | 32/10/0/2 | New Geant4 interface, genstep collection, U4Recorder |
| gdxml | 6/4/0/0 | GDML loaded as XML for fixups |
| g4cx | 3/2/0/0 | New top level package integrating Geant4 and CSGOptiX |
green : active development, blue : plan to remove, red : removed
| Old simulation (OptiXRap) | New simulation (QUDARap/qsim.h + CSGOptiX) |
|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Goals of re-implementation : flexible, modular GPU simulation, easily testable, less code
CSGOptiX built upon:
Geometry + Simulation brought together here
void simulate(const uint3& idx, const uint3& dim)
{
// skip some setup
curandState rng = sim->rngstate[idx] ;
sim->generate_photon(ctx.p, rng, gs, idx, gs_idx );
int command = START ;
int bounce = 0 ;
ctx.point(bounce);
while( bounce < evt->max_bounce )
{
trace( params.handle, ctx.p.pos, ctx.p.mom,
params.tmin, params.tmax, prd);
if( prd->boundary() == 0xffffu ) break ;
ctx.trace(bounce); // record PerRayData intersect
command = sim->propagate(bounce, rng, ctx);
bounce++;
ctx.point(bounce) ;
if(command == BREAK) break ;
}
ctx.end(); // write seq, tag, flat
evt->photon[idx] = ctx.p ; // final photon
}
https://bitbucket.org/simoncblyth/opticks/src/master/CSGOptiX/CSGOptiX7.cu
Strong Code Reduction Potential
Several pkgs only used by geo translation, not sim
Longer term: revisit geometry translation + viz
Same translation in less code:
| pkg | hh/cc/cu/py | Base |
|---|---|---|
| okconf | 3/2/0/0 | config version detection |
| sysrap | 150/70/1/8 | basis types, new array NP.hh |
| boostrap | 46/42/0/0 | boost tools |
| npy | 181/165/0/6 | geo primitives, old array NPY.hpp |
| optickscore | 70/62/0/1 | old core, argument parsing |
| pkg | hh/cc/cu/py | Geometry |
|---|---|---|
| ggeo | 68/65/0/2 | complete geometry model : no Geant4 dependency |
| extg4 | 62/53/0/0 | Geant4 geometry translation into GGeo model |
| opticksgeo | 10/6/0/0 | high level pkg on top of ggeo |
| cfg4 | 126/117/0/3 | Old Geant4 comparison machinery, eg CRecorder |
green : active development, blue : plan to remove, red : removed
sysrap/stag.h enum
enum {
stag_undef = 0,
stag_to_sci = 1,
stag_to_bnd = 2,
stag_to_sca = 3,
stag_to_abs = 4,
stag_at_burn_sf_sd = 5,
stag_at_ref = 6,
stag_sf_burn = 7,
stag_sc = 8,
stag_to_ree = 9,
stag_re_wl = 10,
stag_re_mom_ph = 11,
stag_re_mom_ct = 12,
stag_re_pol_ph = 13,
stag_re_pol_ct = 14,
stag_hp_ph = 15
//stag_hp_ct = 16
}; // 4 bit tags pack better
qudarap/qsim.h:
459 inline QSIM_METHOD int qsim::propagate_to_boundary(unsigned& flag,
curandStateXORWOW& rng, sctx& ctx)
460 {
462 const sstate& s = ctx.s ;
464 const float& absorption_length = s.material1.y ;
465 const float& scattering_length = s.material1.z ;
471 #ifdef DEBUG_TAG
472 float u_to_sci = curand_uniform(&rng) ; // burn to align
473 float u_to_bnd = curand_uniform(&rng) ; // burn to align
474 #endif
475 float u_scattering = curand_uniform(&rng) ;
476 float u_absorption = curand_uniform(&rng) ;
478 #ifdef DEBUG_TAG
479 stagr& tagr = ctx.tagr ;
480 tagr.add( stag_to_sci, u_to_sci);
481 tagr.add( stag_to_bnd, u_to_bnd);
482 tagr.add( stag_to_sca, u_scattering);
483 tagr.add( stag_to_abs, u_absorption);
491 #endif
489 float scattering_distance = -scattering_length*logf(u_scattering);
490 float absorption_distance = -absorption_length*logf(u_absorption);
...
sysrap/stag.h/stagr collects into tag.npy flat.npy : A-side straightforward as already control all code
u4/U4Stack.h enum
enum {
U4Stack_Unclassified = 0,
U4Stack_RestDiscreteReset = 1,
U4Stack_DiscreteReset = 2,
U4Stack_ScintDiscreteReset = 3,
U4Stack_BoundaryDiscreteReset = 4,
U4Stack_RayleighDiscreteReset = 5,
U4Stack_AbsorptionDiscreteReset = 6,
U4Stack_BoundaryBurn_SurfaceReflectTransmitAbsorb = 7,
U4Stack_BoundaryDiDiTransCoeff = 8,
U4Stack_AbsorptionEffDetect = 9,
U4Stack_RayleighScatter = 10,
U4Stack_BoundaryDiMeReflectivity = 11,
U4Stack_ChooseReflection = 12,
U4Stack_RandomDirection = 13,
U4Stack_LambertianRand = 14,
U4Stack_Reemission = 15 // 4-bit better
};
u4/InstrumentedG4OpBoundaryProcess.hh
u4/InstrumentedG4OpBoundaryProcess.cc
u4/tests/ShimG4OpRayleigh.hh
u4/tests/ShimG4OpRayleigh.cc
u4/tests/ShimG4OpAbsorption.hh
u4/tests/ShimG4OpAbsorption.cc
u4/tests/DsG4Scintillation.h
u4/tests/DsG4Scintillation.cc
u4/tests/ShimG4OpAbsorption.cc:
32 void ShimG4OpAbsorption::ResetNumberOfInteractionLengthLeft()
33 {
35 G4double u = G4UniformRand() ;
36 SEvt::AddTag( U4Stack_AbsorptionDiscreteReset, u );
45 theNumberOfInteractionLengthLeft = -1.*G4Log(u) ;
49 }
882 void SEvt::addTag(unsigned tag, float flat){
883 {
885 stagr& tagr = current_ctx.tagr ;
899 tagr.add(tag,flat) ;
929 }
sysrap/SEvt.cc:SEvt::AddTag using same stagr struct as A
u4/U4Process::ClearNumberOfInteractionLengthLeft
Aligned : same u, same purpose
epsilon:~ blyth$ gx
/Users/blyth/opticks/g4cx
epsilon:g4cx blyth$ ./ab.sh
In [1]: AB(54)
Out[1]:
A(54) : TO BT BR BR BT SA B(54) : TO BT BR BR BT SA
A.t : (1000, 64) B.t : (1000, 64)
A.t2 : (1000, 64) B.t2 : (1000, 64)
0 : 0.7082 : 1 : to_sci 0 : 0.7082 : 3 : ScintDiscreteReset :
1 : 0.0797 : 2 : to_bnd 1 : 0.0797 : 4 : BoundaryDiscreteReset :
2 : 0.1970 : 3 : to_sca 2 : 0.1970 : 5 : RayleighDiscreteReset :
3 : 0.4009 : 4 : to_abs 3 : 0.4009 : 6 : AbsorptionDiscreteReset :
4 : 0.3778 : 5 : at_burn_sf_sd 4 : 0.3778 : 7 : BoundaryBurn_SurfaceReflectTransmitAbsorb :
5 : 0.7441 : 6 : at_ref 5 : 0.7441 : 8 : BoundaryDiDiTransCoeff :
...
32 : 0.2452 : 1 : to_sci 32 : 0.2452 : 3 : ScintDiscreteReset :
33 : 0.5043 : 2 : to_bnd 33 : 0.5043 : 4 : BoundaryDiscreteReset :
34 : 0.1788 : 3 : to_sca 34 : 0.1788 : 5 : RayleighDiscreteReset :
35 : 0.8004 : 4 : to_abs 35 : 0.8004 : 6 : AbsorptionDiscreteReset :
36 : 0.3335 : 5 : at_burn_sf_sd 36 : 0.3335 : 7 : BoundaryBurn_SurfaceReflectTransmitAbsorb :
37 : 0.7170 : 7 : sf_burn 37 : 0.7170 : 9 : AbsorptionEffDetect :
38 : 0.0000 : 0 : undef 38 : 0.0000 : 0 : Unclassified :
A : /tmp/blyth/opticks/RaindropRockAirWaterSmall/G4CXSimulateTest/ALL
B : /tmp/blyth/opticks/RaindropRockAirWaterSmall/U4RecorderTest/ShimG4OpAbsorption_FLOAT_ShimG4OpRayleigh_FLOAT/ALL
In [2]:
sm = np.all( a.seq[:,0] == b.seq[:,0] ) : True
ww = np.where( a.seq[:,0] != b.seq[:,0] )[0] : [] History alignment : can be accidental
we = np.where( A.t.view('|S64') != B.t2.view('|S64') )[0] : [] Tag enum alignment : much more stringent
match sequences of 64 tags by viewing as 64 char string
epsilon:~ blyth$ ~/opticks/g4cx/ab.sh sm = np.all( a.seq[:,0] == b.seq[:,0] ) : True ## all 1M histories match we = np.where( A.t.view('|S64') != B.t2.view('|S64') )[0] : [ 41595 114799 125032 158475 174993 243023 ... ] In [2]: seqhis_(a.seq[we,0]) ## 22/1M not enum aligned : all with truncated record Out[2]: ['TO BT SC BR BR BR BR BR BR BR', 'TO BT SC BR BR BR BR BR BR BR', ... In [4]: wm = np.where( A.t.view('|S64') == B.t2.view('|S64') )[0] ; wm, len(wm) ## select the A-B enum aligned Out[4]: array([ 0, 1, 2, 3, ..., 999995, 999996, 999997, 999998, 999999]), 999978 In [29]: np.abs(a.record[wm] - b.record[wm]).max() ## max deviation : 0.018 mm Out[29]: 0.018722534 In [30]: np.where( np.abs(a.record[wm] - b.record[wm]) > 0.01 ) ## select photon step points with deviation > 0.01 Out[30]: (array([ 18157, 125121, 467717, 499537, 624529, 695184, 759091, 779861, 938053]), ## photon index array([1, 1, 3, 1, 1, 2, 4, 1, 1]), ## step point index array([0, 0, 0, 0, 0, 0, 0, 0, 0]), ## first photon row is position, time array([2, 2, 2, 2, 2, 0, 2, 2, 2])) ## 2:Z 0:X coordinate In [31]: dv = wm[np.unique(np.where( np.abs(a.record[wm] - b.record[wm]) > 0.01 )[0])] ## deviant indices In [36]: seqhis_(a.seq[dv,0]) ## deviant step point mostly AB/SC position Out[36]: ['TO AB','TO AB','TO BT BT AB','TO AB','TO AB','TO BR AB','TO SC BT BT SA','TO AB','TO AB']
In [4]: AB(we[0]) ## all 22/1M loose alignment at 1st BR after SC Out[4]: A(41595) : TO BT SC BR BR BR BR BR BR BR B(41595) : TO BT SC BR BR BR BR BR BR BR 0 : 0.6518 : 1 : to_sci 0 : 0.6518 : 3 : ScintDiscreteReset : 1 : 0.3244 : 2 : to_bnd 1 : 0.3244 : 4 : BoundaryDiscreteReset : 2 : 0.2309 : 3 : to_sca 2 : 0.2309 : 5 : RayleighDiscreteReset : 3 : 0.7327 : 4 : to_abs 3 : 0.7327 : 6 : AbsorptionDiscreteReset : 4 : 0.1133 : 5 : at_burn_sf_sd 4 : 0.1133 : 7 : BoundaryBurn_SurfaceReflectTransmitAbsorb : 5 : 0.6275 : 6 : at_ref 5 : 0.6275 : 8 : BoundaryDiDiTransCoeff : 6 : 0.4789 : 1 : to_sci 6 : 0.4789 : 3 : ScintDiscreteReset : 7 : 0.6990 : 2 : to_bnd 7 : 0.6990 : 4 : BoundaryDiscreteReset : 8 : 0.9998 : 3 : to_sca 8 : 0.9998 : 5 : RayleighDiscreteReset : 9 : 0.0067 : 4 : to_abs 9 : 0.0067 : 6 : AbsorptionDiscreteReset : 10 : 0.1797 : 8 : sc 10 : 0.1797 : 10 : RayleighScatter : 11 : 0.0088 : 8 : sc 11 : 0.0088 : 10 : RayleighScatter : 12 : 0.5316 : 8 : sc 12 : 0.5316 : 10 : RayleighScatter : 13 : 0.8436 : 8 : sc 13 : 0.8436 : 10 : RayleighScatter : 14 : 0.4477 : 8 : sc 14 : 0.4477 : 10 : RayleighScatter : 15 : 0.4004 : 1 : to_sci 15 : 0.4004 : 3 : ScintDiscreteReset : 16 : 0.8328 : 2 : to_bnd 16 : 0.8328 : 4 : BoundaryDiscreteReset : 17 : 0.0016 : 3 : to_sca 17 : 0.0016 : 5 : RayleighDiscreteReset : 18 : 0.2059 : 4 : to_abs 18 : 0.2059 : 6 : AbsorptionDiscreteReset : 19 : 0.5832 : 5 : at_burn_sf_sd 19 : 0.5832 : 7 : BoundaryBurn_SurfaceReflectTransmitAbsorb : 20 : 0.7585 : 6 : at_ref <----- ALIGNMENT LOST HERE : DO NOT GET EXPECTED TAG : BoundaryDiDiTransCoeff 20 : 0.7585 : 3 : ScintDiscreteReset : 21 : 0.6396 : 1 : to_sci 21 : 0.6396 : 4 : BoundaryDiscreteReset : 22 : 0.9837 : 2 : to_bnd 22 : 0.9837 : 5 : RayleighDiscreteReset :
SC/AB distance = -length*logf(u)
SC/AB lengths often large > 1e6 mm
small logf(u) deviations scaled up
=> SC/AB position difference
SC+AB compete on distance with reaching boundary
=> when SC/AB win : u > 0.998
-log(u) u > 0.998 -log(1-x) ~= x x < 0.002
"Raindrop" geometry : 50mm radius sphere of water in air
Previously assumed float/double difference between Opticks/Geant4
sysrap/tests/logTest.cu : CUDA comparison
Deviations small,
Using KLUDGE_FASTMATH_LOGF reduces SC/AB position deviation
##define KLUDGE_FASTMATH_LOGF(u) (u < 0.998f ? __logf(u) : __logf(u) - 0.46735790f*1e-7f )
Full geom, but target single PMT
Input photons simplify random consumption
Configure geometry and input photons:
geom_ oip
Run A and B simulations and "T" simtrace running:
cd ~/opticks/g4cx ./gxs.sh ## A ./gxt.sh ## T collect "simtrace" arrays of intersect cd ~/opticks/u4 ./u4s.sh ## B
gxt.sh uses "simtrace" arrays to make 2D slice plots (pyvista, matplotlib)
cd ~/opticks/g4cx ./gxt.sh ana ./ab.sh # compare A and B with NumPy
"simtrace" : scatter plot of intersects, looks like missed ellipsoid transform
gx ; EYE=0,1000,200 LOOK=0,0,200 ./ab.sh recplot
single PMT test : no-such missed ellipsoid transform issue
Full GDML geometry tests + Standalone tests
First step to investigate issues:
Aligned A-B validation iteration
Very powerful tool for finding geometry issues
Non-aligned A-B iteration : statistical comparison
Update JUNO-Opticks integration to use new Opticks
| JUNO Simulation -> Opticks -> GPU | |
| Benefits: |
|
| Status: |
|
Next Steps Summary
Changed gears:
Many Thanks to : Yuxiang Hu
| https://bitbucket.org/simoncblyth/opticks | code repository |
| https://simoncblyth.bitbucket.io | presentations and videos |