Status of JUNOSW + Opticks : GPU ray trace accelerated optical photon simulation


Open source, https://github.com/simoncblyth/opticks

Simon C Blyth, IHEP, CAS — JUNO Collaboration Meeting, Wuhan — 22 January 2026


Outline

newtons-opticks.png





(JUNO) Optical Photon Simulation Problem...

Opticks solves this using GPU ray tracing via NVIDIA OptiX


Geant4 + Opticks + NVIDIA OptiX : Hybrid Workflow

Opticks enables Geant4 based simulation to offload optical photon simulation to the GPU

NVIDIA GPU ray tracing of billions[1] of rays per second applied to optical simulation

[1] Actual performance depends on geometry and its modelling, JUNO optical simulation speedups > 1000x Geant4 have been measured


OJ : Opticks+JUNOSW Automated Gitlab-CI/CD Releases

Use Opticks+JUNOSW latest release J25.7.2_Opticks-v0.5.6 by sourcing envset.sh, eg:

source /cvmfs/opticks.ihep.ac.cn/oj/releases/J25.7.2_Opticks-v0.5.6/el9_amd64_gcc11/2026_01_08/envset.sh
l /cvmfs/opticks.ihep.ac.cn/oj/releases/J25.7.2_Opticks-v0.5.6/el9_amd64_gcc11/
lrwxrwxrwx. 1 cvmfs cvmfs   3 Jan 15 17:17 Latest -> Thu
drwxr-xr-x. 7 cvmfs cvmfs 181 Jan 15 17:10 Thu
drwxr-xr-x. 7 cvmfs cvmfs 181 Jan 14 17:12 Wed
drwxr-xr-x. 7 cvmfs cvmfs 181 Jan 13 17:12 Tue
drwxr-xr-x. 7 cvmfs cvmfs 181 Jan 12 17:11 Mon
drwxr-xr-x. 7 cvmfs cvmfs 181 Jan 11 17:11 Sun
drwxr-xr-x. 7 cvmfs cvmfs 181 Jan 10 17:12 Sat
drwxr-xr-x. 7 cvmfs cvmfs 181 Jan  9 17:11 Fri
lrwxrwxrwx. 1 cvmfs cvmfs  10 Jan  8 17:19 LastRef -> 2026_01_08
drwxr-xr-x. 7 cvmfs cvmfs 181 Jan  8 17:12 2026_01_08

Opticks Enhancements : for high photon count events

Integration + iteration directed by production experience with Opticks + JUNOSW

Reduced resource simulation using summary "Muon" hits --pmt-hit-type 2

Removed 32-bit max photon limits -> simulation of giga optical photon events

CUDA implementation of PMT hit merging (thrust::sort_by_key,reduce_by_key)
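
A minimal sketch of why 32-bit arithmetic caps the photon count. It illustrates the general problem and does not describe the actual Opticks changes. The 64-byte record size and the names `offset32` and `offset64` are assumptions made for this example.

```cpp
#include <cstdint>

// Hypothetical record size: 16 floats (64 bytes) per photon, assumed for illustration.
constexpr std::uint64_t kPhotonBytes = 64;

// Byte offset of photon `idx` computed with 32-bit arithmetic.
// The product wraps modulo 2^32, so offsets beyond 4 GiB are wrong.
inline std::uint32_t offset32(std::uint32_t idx)
{
    return idx * static_cast<std::uint32_t>(kPhotonBytes);
}

// Same offset computed with 64-bit arithmetic: stays valid for giga-photon events.
inline std::uint64_t offset64(std::uint64_t idx)
{
    return idx * kPhotonBytes;
}
```

With these assumed sizes the 32-bit offset already wraps at 2^32 / 64 = 67,108,864 photons, well below the ~150M photons of the double muon event discussed later.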


GEOM_J25_4_0_opticks_Debug_cxr_min_muon_cxs_20250707_112242.png

EVT=muon_cxs cxr_min.sh #12 : photons from muon crossing JUNO Scintillator


GEOM_J25_4_0_opticks_Debug_cxr_min_muon_cxs_20250707_112243.png

EVT=muon_cxs cxr_min.sh #13


GEOM_J25_4_0_opticks_Debug_cxr_min_muon_cxs_20250707_112244.png

EVT=muon_cxs cxr_min.sh #14

Large counts motivate merging of hits with same (pmtid,timebucket) : adding counts and keeping earliest time


GPU Hit Merging : High Level Parallelization with CUDA Thrust

struct key_functor {   //  Bitwise-OR (pmtid,timebucket) 
  float    timewindow;
  uint64_t operator()(const sphotonlite& p) const // 16+48 = 64
  {
     return (uint64_t(p.identity()) << 48) | uint64_t(p.time/timewindow);
  }
};

Opticks/sysrap SPM::merge_partial_select using CUDA Thrust (higher level C++ way to use CUDA)

Thrust method    Action                   Note
copy_if          photon -> hit            using flagmask
transform        hit -> key               bitwise-OR (pmtid, timebucket)
sort_by_key      hit, key -> hit          hit ordered with same (pmtid,timebucket) contiguous
reduce_by_key    hit, key -> hitmerged    merge two hit : earlier time, sum hitcount

https://github.com/simoncblyth/opticks/blob/master/sysrap/SPM.cu

https://github.com/simoncblyth/opticks/blob/master/sysrap/sphotonlite.h
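
The copy_if, transform, sort_by_key, reduce_by_key sequence can be sketched on the CPU with the C++ standard library. This is a host-only analogue for illustration, not the code in SPM.cu. The `Hit` struct, `make_key` and `merge_hits` are hypothetical names, `Hit` is not the sphotonlite layout, and hit times are assumed non-negative.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a lightweight hit record (not the Opticks sphotonlite layout).
struct Hit {
    std::uint16_t pmtid;     // PMT identity, assumed to fit in 16 bits
    float         time;      // hit time in ns, assumed non-negative
    std::uint32_t count;     // number of photons merged into this hit
    std::uint32_t flagmask;  // history flags used for hit selection
};

// Pack (pmtid, timebucket) into 64 bits: 16 high bits + 48 low bits.
inline std::uint64_t make_key(const Hit& h, float timewindow)
{
    const std::uint64_t bucket = static_cast<std::uint64_t>(h.time / timewindow);
    return (static_cast<std::uint64_t>(h.pmtid) << 48) | (bucket & 0xFFFFFFFFFFFFULL);
}

// CPU analogue of copy_if -> transform -> sort_by_key -> reduce_by_key.
inline std::vector<Hit> merge_hits(const std::vector<Hit>& photons,
                                   std::uint32_t selectmask, float timewindow)
{
    // copy_if : keep photons whose flagmask overlaps the selection mask
    std::vector<Hit> hits;
    std::copy_if(photons.begin(), photons.end(), std::back_inserter(hits),
                 [=](const Hit& p) { return (p.flagmask & selectmask) != 0; });

    // transform + sort_by_key : order hits so that equal keys are contiguous
    std::stable_sort(hits.begin(), hits.end(),
                     [=](const Hit& a, const Hit& b) {
                         return make_key(a, timewindow) < make_key(b, timewindow);
                     });

    // reduce_by_key : merge runs of equal key, keeping earliest time and summing counts
    std::vector<Hit> merged;
    std::uint64_t prev_key = 0;
    for (const Hit& h : hits) {
        const std::uint64_t key = make_key(h, timewindow);
        if (!merged.empty() && key == prev_key) {
            Hit& m = merged.back();
            m.time = std::min(m.time, h.time);
            m.count += h.count;
        } else {
            merged.push_back(h);
            prev_key = key;
        }
    }
    return merged;
}
```

Calling `merge_hits(photons, selectmask, 1.0f)` corresponds to a 1 ns bucket merge. The Thrust version runs the same four steps as parallel primitives on the GPU, so only merged hits need to be downloaded.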


GPU Hit Merging : Let Opticks Shine Bright [by not obscuring it]

Detsim timings for one double muon event, ~150M photons, 28M hit, 6.4M mergedHit, 1ns bucket merge

JUNOSW J_Std1,2     --pmt-hit-type 1    --pmt-hit-type 2    Total processing time excluding Initialize
--opticks-mode 0    7112 s (118min)     6904 s (115min)     ]junoSD_PMT_v2::Initialize → [junoSD_PMT_v2::EndOfEvent

Opticks+J [1]    (CPUMerge)+Coll. [s]    Kernel [s]    (GPUMerge)+Download [s]    Total (no-init) [s]    Speedup vs
                                         PREL→POST     POST→DOWN                  HEAD→RESET             J_Std1,2
hit                   190.445             22.996           1.949                    215.560               x32
hitmerged               6.712             22.988           0.543                     30.400               x233
hitlite               146.471             23.108           0.484                    170.226               x31
hitlitemerged           0.403             23.097           0.181                     23.835               x221

Opticks+J : overall speedup > x200 [~2 hrs → ~30 s]

[1] : Workstation (Dell Precision 7960), NVIDIA RTX 5000 Ada, 3rd gen. RT cores, 32 GB

https://code.ihep.ac.cn/blyth/j/-/blob/main/zhenning_double_muon/detsim.sh


II. Work status this year : Completion of research tasks (II)

Opticks + JUNOSW validations reveal JUNO geometry bugs, including:

Integration of Opticks+JUNOSW with gitlab CI/CD build/test/release

OpticksService : first implementation of HTTP optical server working

OpticksClient : almost test stage [CPU node : collects gensteps, libcurl request, response]


Geant4 + Opticks + NVIDIA OptiX : Hybrid Workflow 4x4 ?


Production running via "Monolithic" scaling : inefficient use of scarce GPU resources ?


OpticksClients + OpticksService : Share GPUs


Client.png

How many clients ? Depends on server, event photon count, network, ... (Experimentation needed)


Split Workflow : Share GPUs between OpticksClients

OpticksClient : Detector Simulation Framework (Geant4 etc.), runs on CPU node, no GPU needed
  • U4.h : collect gensteps
  • NP_CURL.h : HTTP POST (libcurl)
    • request : genstep array
    • response : hits array
OpticksService : CSGOptiX + NVIDIA OptiX + GPU
  • FastAPI : ASGI python web framework (alt: sanic, aiohttp) grandmetric.com/python-rest-frameworks-performance-comparison/
  • nanobind : python <=> C++ (alt: pybind11), uv : pip
  • CSGFoundry.h : load persisted geometry
  • CSGOptiXService.h : simulate
Prototype clients + service under development
  • scale up to MC production ? ~100/1000 clients ?
  • use multi-GPU to serve more clients ?

OR C++ web framework eg:
  • binding, less standard
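
The request and response bodies are arrays, so the client needs some array serialization. A minimal sketch of one possibility, assuming (the slides do not confirm this) that arrays travel in NumPy .npy format and that gensteps are little-endian float32 arrays of shape (N, 6, 4). The function name `npy_header_f32_n64` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Build a NumPy .npy version 1.0 header for a little-endian float32 array
// of shape (n, 6, 4). The raw array bytes would follow this header in the body.
inline std::string npy_header_f32_n64(std::uint32_t n)
{
    std::string dict = "{'descr': '<f4', 'fortran_order': False, 'shape': ("
                     + std::to_string(n) + ", 6, 4), }";

    // magic (6 bytes) + version (2 bytes) + header length (2 bytes) precede the dict
    const std::size_t prefix = 10;

    // pad with spaces so that prefix + dict + '\n' is a multiple of 64 bytes
    const std::size_t unpadded = prefix + dict.size() + 1;
    const std::size_t pad = (64 - unpadded % 64) % 64;
    dict.append(pad, ' ');
    dict.push_back('\n');

    std::string out("\x93NUMPY", 6);   // magic string
    out.push_back('\x01');             // format major version
    out.push_back('\x00');             // format minor version
    const std::uint16_t hlen = static_cast<std::uint16_t>(dict.size());
    out.push_back(static_cast<char>(hlen & 0xFF));  // header length, little-endian
    out.push_back(static_cast<char>(hlen >> 8));
    out += dict;
    return out;
}
```

A self-describing payload like this lets a Python service parse the request directly into an array, whichever web framework is chosen.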

Water Distributor (WD) implementation from Peidong

Complex solids, between Water Pool and CD, many R+T transforms, multi-union, long way from origin

Multiple issues:

Branches :

yupd-water-distributor

yupd_bottompipe_adjust

yupd_waterdistributor_heightfix

WDP 89middle : Lower and Upper WD

WDP 3 : Uppermost Water Distributor in Water Pool

WDP 4 : Water Distributor at top of CD

WDP 5 : mid-CD

WDP 6 : Water Distributor at bottom of CD

WDP 7 : Water Distributor at bottom of pool

Water Distributor Issues (FIXED)

Raytrace directly reveals (MR1062):

LOROS : simtrace boundary wide view at top

LOROS : simtrace wide bottom view 20251218_100415

LOROS : simtrace primtab : snake thru the dead zone with normals

LWDS : cxt_min.sh : Simtrace Split Snake

LWDS : XZ plane simtrace showing the mis-placed gap

MOI=264.136,-566.442,-20475,1500  # XY:middle-of-pipe Z:middle-of-outer-tyvek

LWDS : cxt_min.sh : Legs still not segmented at bottom

MPMT 8-inch implementation from Peidong

Working with Peidong on MR 1063

MPMT 19,20,21,construction

MPMT (branch yupd-8inch-pmt)

MPMT 16 : MPMT in context

MPMT 18 : wide view showing added PMTs

EMF Coils implementation from Chen Jing

Working with Chen Jing on MR 1086

Currently using G4Polycone with phi ranges

EMF2 2

ELV=^s_EMF MOI=/tmp/emf2.npy EYE=0,0,-2 UP=0,1,0 cxr_min.sh  ## raytrace

ELV : select solids

MOI : pick frame

View along EMF axis : (theta,phi) (56,-54) degrees

EMF2 3 : View from bottom of pool, with very large volumes excluded

EMF X306

Former EMFcoils impl:

Other recent JUNO+Opticks Geom Fixes

348 WP_ATM_LPMT directed inwards, not outwards (MR 981)

G4: Photons escape between Lower and UpperChimney

Overlap of Water Distributor with Hama PMTs

Opticks:Geant4 photon history Chi2, reveals bugs:


Problems

  1. Opticks still single developer project, need to:
    • simplify Opticks usage
    • train more people

  2. JUNO geometry continues rapid change, eg:
    • Water Distributor
    • Water Pool PMTs still being added
    • EMF coil
    • Acrylic repairs
    • => all additions require JUNOSW + Opticks work/validation

Chronic lack of manpower working on Opticks (and JUNO Simulation)


Plans

Get Geometry Changes Merged

Automated Validation/Performance Monitoring

Production Optimization

Improve Opticks User Experience

CI/CD : Continuous Integration/Continuous Deployment