running_with_more_photons¶
New Opticks : How to Increase the max number of photons/gensteps ?¶
The defaults can be changed by setting environment variables:
OPTICKS_MAX_GENSTEP
OPTICKS_MAX_PHOTON
Those defaults can be overridden in the code by calling SEventConfig static functions prior to calling G4CXOpticks::SetGeometry:
SEventConfig::SetMaxGenstep
SEventConfig::SetMaxPhoton
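A minimal sketch of the envvar route follows. It assumes plain integer values are accepted, which matches how SEventConfigTest reports them. The helper name set_opticks_event_limits is hypothetical and not part of Opticks. The check at the end runs only if the SEventConfigTest binary is on PATH.

```shell
# Hypothetical helper (not part of Opticks): export the event-size limits.
# Assumes plain integer values, matching how SEventConfigTest reports them.
set_opticks_event_limits() {
    export OPTICKS_MAX_GENSTEP="$1"
    export OPTICKS_MAX_PHOTON="$2"
}

set_opticks_event_limits 3000000 3000000

# Confirm the values were picked up, if the Opticks test binary is available.
if command -v SEventConfigTest >/dev/null 2>&1; then
    SEventConfigTest | grep -E "OPTICKS_MAX_(GENSTEP|PHOTON)"
fi
```

The exports must be in the environment of the process that runs the simulation, so they belong in the shell or script that launches it.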
In addition to that configuration, it is also necessary to create QCurandState files corresponding to the MaxPhoton you set. This can be done using the qudarap-prepare-installation bash function:
qudarap-prepare-sizes-Linux-(){ echo ${OPTICKS_QUDARAP_RNGMAX:-1,3,10} ; }
qudarap-prepare-sizes-Darwin-(){ echo ${OPTICKS_QUDARAP_RNGMAX:-1,3} ; }
qudarap-prepare-sizes(){ $FUNCNAME-$(uname)- | tr "," "\n" ; }
qudarap-prepare-installation()
{
    local sizes=$(qudarap-prepare-sizes)
    local size
    for size in $sizes ; do
        QCurandState_SPEC=$size:0:0 QCurandStateTest
        rc=$? ; [ $rc -ne 0 ] && return $rc
    done
    return 0
}
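The size list is read from OPTICKS_QUDARAP_RNGMAX, so extra sizes can be requested without editing the functions above. The sketch below is a dry run. It assumes the spec has the form size:seed:offset with size in millions, which is an interpretation of the $size:0:0 pattern and is not confirmed by the source. The helper name qcurandstate_specs is hypothetical; it only prints the QCurandState_SPEC values and never runs QCurandStateTest.

```shell
# Hypothetical dry-run helper (not part of Opticks): prints the
# QCurandState_SPEC value that would be used for each requested size.
qcurandstate_specs() {
    # $1: comma-separated sizes; falls back to OPTICKS_QUDARAP_RNGMAX,
    # then to the Linux default of 1,3,10 shown above.
    local sizes=${1:-${OPTICKS_QUDARAP_RNGMAX:-1,3,10}}
    echo "$sizes" | tr "," "\n" | while read -r size; do
        # Assumed meaning: size (in millions) : seed : offset
        echo "${size}:0:0"
    done
}

# Preview what adding a 100M entry to the list would request.
qcurandstate_specs 1,3,10,100
```

With the real functions loaded, the equivalent request would presumably be `OPTICKS_QUDARAP_RNGMAX=1,3,10,100 qudarap-prepare-installation`.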
Check your environment with the SEventConfigTest binary:
epsilon:~ blyth$ SEventConfigTest
test_EstimateAlloc@20:
SEventConfig::Desc
OPTICKS_EVENT_MODE EventMode : Default
OPTICKS_RUNNING_MODE RunningMode : 0
RunningModeLabel : SRM_DEFAULT
OPTICKS_G4STATE_SPEC G4StateSpec : 1000:38
G4StateSpecNotes : 38=2*17+4 is appropriate for MixMaxRng
OPTICKS_G4STATE_RERUN G4StateRerun : -1
OPTICKS_MAX_GENSTEP MaxGenstep : 1000000
OPTICKS_MAX_PHOTON MaxPhoton : 1000000
OPTICKS_MAX_SIMTRACE MaxSimtrace : 1000000 MaxCurandState : 1000000
OPTICKS_MAX_BOUNCE MaxBounce : 9
MaxBounceNotes : MaxBounceLimit:31, MaxRecordLimit:32 (see sseq.h)
OPTICKS_MAX_RECORD MaxRecord : 0
OPTICKS_MAX_REC MaxRec : 0
OPTICKS_MAX_SEQ MaxSeq : 0
OPTICKS_MAX_PRD MaxPrd : 0
OPTICKS_MAX_TAG MaxTag : 0
OPTICKS_MAX_FLAT MaxFlat : 0
OPTICKS_HIT_MASK HitMask : 64
HitMaskLabel : SD
OPTICKS_MAX_EXTENT MaxExtent : 1000
OPTICKS_MAX_TIME MaxTime : 10
OPTICKS_RG_MODE RGMode : 2
RGModeLabel : simulate
OPTICKS_COMP_MASK CompMask : 262
CompMaskLabel : genstep,photon,hit
OPTICKS_OUT_FOLD OutFold : $DefaultOutputDir
OPTICKS_OUT_NAME OutName : -
OPTICKS_PROPAGATE_EPSILON PropagateEpsilon : 0.0500
OPTICKS_INPUT_PHOTON InputPhoton : -
New Opticks : Background on QCurandState files¶
The QCurandState files allow curand to be used in the simulation without initializing it every time, which is much faster than calling curand_init on every run. If curand were initialized during simulation, performance would be worse by factors of 100, because the large stack that curand_init requires means far fewer threads can run at the same time.
- Opticks uses curand to generate randoms on device.
- Initializing curand requires calling curand_init, which needs far more stack than ray tracing and simulation do. To avoid paying that cost on every run (large stacks and few threads in flight), initialization is done once at Opticks installation time and the resulting curandState objects are saved to file.
- When a simulation is run, the QCurandState files are loaded and uploaded to the device.
- The QCurandState files are necessary. They do not contain random numbers, they contain initialized curandState objects needed to generate random numbers on device.
I plan at some point to support splitting the QCurandState into multiple smaller files, both to avoid the inconvenience of very large (> 4GB) files and to avoid having to recreate all the states when you want to extend the size.
Old Opticks¶
Not Enough RNG issue:
-------------------------
Number of Scintillation Photons: 12177
Number of Cerenkov Photons: 3543830
-------->Storing hits in the ROOT file: in this event there are 37929 hits in the tracker chambers:
HC: volTPCActive_lArTPC_HC
###[ EventAction::EndOfEventAction G4Opticks.propagateOpticalPhotons
2020-11-19 09:41:50.832 FATAL [20312] [OpticksEvent::resize@1100] NOT ENOUGH RNG : USE OPTION --rngmax 3/10/100 num_photons 3556007 rng_max 3000000
G4OpticksTest: /home/wenzel/gputest/opticks/optickscore/OpticksEvent.cc:1106: void OpticksEvent::resize(): Assertion `enoughRng && " need to prepare and persist more RNG states up to maximual per propagation number"' failed.
Aborted (core dumped)
--------------------------
Initializing cuRAND requires a large GPU stack size, so it is done in separate CUDA launches during installation by cudarap-prepare-installation:
epsilon:opticks blyth$ t opticks-prepare-installation
opticks-prepare-installation ()
{
    local msg="=== $FUNCNAME :";
    echo $msg generating RNG seeds into installcache;
    cudarap-;
    cudarap-prepare-installation
}
epsilon:opticks blyth$ cudarap-
epsilon:opticks blyth$ t cudarap-prepare-installation
cudarap-prepare-installation ()
{
    local size;
    cudarap-prepare-sizes | while read size; do
        CUDARAP_RNGMAX_M=$size cudarap-prepare-rng-;
    done
}
epsilon:opticks blyth$ t cudarap-prepare-rng-
cudarap-prepare-rng- ()
{
    local msg="=== $FUNCNAME :";
    local path=$(cudarap-rngpath);
    [ -f "$path" ] && echo $msg path $path exists already && return 0;
    CUDARAP_RNG_DIR=$(cudarap-rngdir) CUDARAP_RNG_MAX=$(cudarap-rngmax) $(cudarap-ibin)
}
Running cudarap-prepare-installation again will just list the curandState files:
[blyth@localhost ~]$ cudarap-
[blyth@localhost ~]$ cudarap-prepare-installation
=== cudarap-prepare-rng- : path /home/blyth/.opticks/rngcache/RNG/cuRANDWrapper_1000000_0_0.bin exists already
=== cudarap-prepare-rng- : path /home/blyth/.opticks/rngcache/RNG/cuRANDWrapper_3000000_0_0.bin exists already
=== cudarap-prepare-rng- : path /home/blyth/.opticks/rngcache/RNG/cuRANDWrapper_10000000_0_0.bin exists already
[blyth@localhost ~]$
Bash functions for creation of larger files are:
cudarap-prepare-rng-400M(){ CUDARAP_RNGMAX_M=400 cudarap-prepare-rng- ; }
cudarap-prepare-rng-200M(){ CUDARAP_RNGMAX_M=200 cudarap-prepare-rng- ; }
cudarap-prepare-rng-100M(){ CUDARAP_RNGMAX_M=100 cudarap-prepare-rng- ; }
cudarap-prepare-rng-10M(){ CUDARAP_RNGMAX_M=10 cudarap-prepare-rng- ; }
cudarap-prepare-rng-2M(){ CUDARAP_RNGMAX_M=2 cudarap-prepare-rng- ; }
cudarap-prepare-rng-1M(){ CUDARAP_RNGMAX_M=1 cudarap-prepare-rng- ; }
The curandState file that is used depends on the --rngmax option. From okc/OpticksCfg.cc it is apparent that the default rngmax is 3M (rngmax 3 multiplied by rngmaxscale 1000000):
m_rngmax(3),
m_rngmaxscale(1000000),
Assuming you already have the 10M file saved, you can increase the maximum number of photons Opticks can handle with, e.g.:
--rngmax 10
It is also possible to change the seed and offset from their defaults of zero with:
--rngseed 1
--rngoffset 42
If 10M photons is insufficient, use the following to initialize more curandState slots, e.g. for 100M:
CUDARAP_RNGMAX_M=100 cudarap-prepare-rng-
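For the failure shown earlier (num_photons 3556007 against rng_max 3000000), the --rngmax value needs to be the smallest prepared size, in millions, that covers the photon count. A minimal sketch of that selection follows. The helper name pick_rngmax is hypothetical, and the list of prepared sizes is assumed to be sorted in ascending order.

```shell
# Hypothetical helper (not part of Opticks): print the smallest prepared
# size, in millions, that covers a given photon count.
# $1: number of photons
# $2: comma-separated prepared sizes in millions, ascending (default 1,3,10)
pick_rngmax() {
    local num_photons=$1
    local sizes=${2:-1,3,10}
    local size
    for size in $(echo "$sizes" | tr "," " "); do
        if [ $((size * 1000000)) -ge "$num_photons" ]; then
            echo "$size"
            return 0
        fi
    done
    # No prepared size is large enough: a bigger file must be created first.
    return 1
}

pick_rngmax 3556007    # prints 10, so run with --rngmax 10
```

When the function fails (for example `pick_rngmax 3556007 1,3`), no prepared file is large enough and a larger one has to be generated before rerunning.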
When using embedded Opticks within G4Opticks¶
Typically the executable's command line is not parsed by Opticks when Opticks is embedded, as it is when using G4Opticks. Opticks is instantiated when the G4Opticks::setGeometry method is called, so to change the Opticks configuration invoke G4Opticks::setEmbeddedCommandlineExtra prior to calling G4Opticks::setGeometry, for example:
const char* extra = "--rngmax 10 --rngseed 1 --rngoffset 42" ;
m_g4ok->setEmbeddedCommandlineExtra(extra);
What is the maximum number of photons that can be handled at once ?¶
The maximum is limited by GPU VRAM. Each photon takes 112 bytes:
- 64 bytes (4*4*4 bytes for 16 32-bit floats/ints) of parameters
- 48 bytes of curandState.
400M photons, corresponding to about 45G, has been found to be close to the maximum possible when using a GPU with 48G of VRAM (NVIDIA Quadro RTX 8000).
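The arithmetic behind those figures can be sketched as below. The function names are hypothetical, and 48G is taken to mean 48e9 bytes, which is an assumption. The estimate ignores geometry and every other use of VRAM, so the practical limit is lower than the computed upper bound.

```shell
# Bytes per photon, from the figures above: 64 (parameters) + 48 (curandState).
BYTES_PER_PHOTON=$((64 + 48))

# Print the VRAM in bytes needed for a given number of photons.
photon_vram_bytes() {
    echo $(( $1 * BYTES_PER_PHOTON ))
}

# Print the largest photon count that fits in a given number of bytes.
max_photons_for_vram() {
    echo $(( $1 / BYTES_PER_PHOTON ))
}

photon_vram_bytes 400000000        # 44800000000, i.e. about 45G
max_photons_for_vram 48000000000   # 428571428, ignoring all other VRAM use
```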
oxrap/ORng : populates rng_states in the OptiX GPU context¶
/**
ORng
====

Uploads persisted curand rng_states to GPU.
Canonical instance m_orng is ctor resident of OPropagator.

Work is mainly done by cudarap-/cuRANDWrapper

TODO: investigate Thrust based alternatives for curand initialization
      potential for eliminating cudawrap-

**/
...
void ORng::init()
{
    unsigned rng_max = m_ok->getRngMax();
    ...
    // OptiX owned RNG states buffer (not CUDA owned)
    m_rng_states = m_context->createBuffer( RT_BUFFER_INPUT, RT_FORMAT_USER);

    m_rng_states->setElementSize(sizeof(curandState));

    if(num_mask == 0)
    {
        m_rng_states->setSize(rng_max);

        curandState* host_rng_states = static_cast<curandState*>( m_rng_states->map() );

        m_rng_wrapper->setItems(rng_max); // why ? to identify which cache file to load i suppose

        m_rng_wrapper->LoadIntoHostBuffer(host_rng_states, rng_max );

        m_rng_states->unmap();
    }
    else
    {
        m_rng_states->setSize(num_mask);

        curandState* host_rng_states = static_cast<curandState*>( m_rng_states->map() );

        m_rng_wrapper->setItems(rng_max); // still need to load the full cache

        m_rng_wrapper->LoadIntoHostBufferMasked(host_rng_states, m_mask ) ; // but make partial copy

        m_rng_states->unmap();
    }

    m_context["rng_states"]->setBuffer(m_rng_states);
}
oxrap/OPropagator : instantiates ORng¶
OPropagator::OPropagator(Opticks* ok, OEvent* oevt, OpticksEntry* entry)
    :
    m_log(new SLog("OPropagator::OPropagator","", LEVEL)),
    m_ok(ok),
    m_oevt(oevt),
    m_ocontext(m_oevt->getOContext()),
    m_context(m_ocontext->getContext()),
    m_orng(new ORng(m_ok, m_ocontext)),
    m_propagateoverride(m_ok->getPropagateOverride()),
    m_nopropagate(false),
    m_entry(entry),
    m_entry_index(entry->getIndex()),
    m_prelaunch(false),