Produce better reduced data in nightly runs #231

nvaytet · 2026-01-30T08:43:19Z

In the nightly runs, we upload an artifact which contains a reduced CIF file from the Dream Geant4 simulation.
The data we used to save was:

- from the endcap_forward detector (because the test was using a parametrized fixture and the last file that was written was from the endcap, overwriting files from previous banks)
- using the 'small' data files used in unit tests.

As a result, there was very little or no signal, and the output looked nothing like a d-spacing spectrum.

Here, we use the data files with more events, and use the mantle detector bank.
We also increase the number of bins from 200 to 2000, to produce output that the analysis can hopefully make better use of.

This was discovered by @AndrewSazonov

SimonHeybrock · 2026-01-30T09:02:28Z

tests/dream/geant4_reduction_test.py

-def test_pipeline_save_data_to_disk(workflow, output_folder: Path):
-    workflow = powder.with_pixel_mask_filenames(workflow, [])


Is this the fixture you mentioned? If it leads to some unintentional test coupling/interaction it is worth investigating that better. What is the fixture scope? Should workflows from the fixture get copied instead of modified directly?

jl-wynen · 2026-01-30T12:22:16Z

The problem is not coupling between tests. It is this parametrised fixture:

essdiffraction/tests/dream/geant4_reduction_test.py

Line 80 in 09041c8

def params_for_det(request):

It means that all tests depending on workflow run once for each detector name, and each run overwrites the files written by the previous one. So only the file written by the last test to run remains.
(Note that the workflow fixture is function-scoped, so there is no issue with modifications leaking between test functions.)

IMHO, the change in this PR makes sense and makes the outcome predictable.

SimonHeybrock · 2026-01-30

👍 OK, I misunderstood, so it was just that they all write to the same filename!
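For readers following the thread, the overwrite can be illustrated with a minimal sketch in plain Python. It is not the project's test code: the loop stands in for pytest running one test per fixture parameter, and the detector order, the file names, and the helpers `save_reduced` and `run_once_per_detector` are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Assumed parameter order; only these two bank names appear in this PR.
DETECTORS = ["mantle", "endcap_forward"]


def save_reduced(folder: Path, detector: str, *, per_detector_name: bool) -> Path:
    """Hypothetical stand-in for the save step: writes one small file per call."""
    name = f"reduced_{detector}.cif" if per_detector_name else "reduced.cif"
    path = folder / name
    path.write_text(f"data from {detector}\n")
    return path


def run_once_per_detector(folder: Path, *, per_detector_name: bool) -> list[str]:
    # A parametrised fixture makes each dependent test run once per parameter,
    # in parameter order; this loop mimics that behaviour.
    for detector in DETECTORS:
        save_reduced(folder, detector, per_detector_name=per_detector_name)
    return sorted(p.name for p in folder.iterdir())


# Shared file name: every run writes to the same path, so only the last survives.
with tempfile.TemporaryDirectory() as tmp:
    shared = run_once_per_detector(Path(tmp), per_detector_name=False)
    survivor = (Path(tmp) / "reduced.cif").read_text()

# Detector name in the file name: one file per run, nothing is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    separate = run_once_per_detector(Path(tmp), per_detector_name=True)
```

With the shared name, `survivor` holds the content written for the last detector in the list. Putting the detector name in the file name is only one possible remedy; the PR itself instead saves from the mantle bank explicitly.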

jl-wynen · 2026-01-30T12:24:01Z

tests/dream/geant4_reduction_test.py

The new test runs in all CI jobs and locally, right? That means that we have to download the large files now. How about only running this test in the nightly run? Or, if that risks detecting broken tests too late, how about using a small file in regular CI and a large file in the nightly run?
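One way the small-versus-large split could look is sketched below. This is a sketch under assumptions, not something taken from the repository: the environment variable name `NIGHTLY_RUN` and the helpers `is_nightly` and `data_size` are hypothetical, and the nightly workflow would have to export the variable itself.

```python
import os

# Hypothetical variable name; the nightly workflow would need to set it.
NIGHTLY_ENV_VAR = "NIGHTLY_RUN"


def is_nightly(environ=None) -> bool:
    """Return True when the (assumed) nightly flag is set to a truthy value."""
    env = os.environ if environ is None else environ
    return env.get(NIGHTLY_ENV_VAR, "").lower() in {"1", "true", "yes"}


def data_size(environ=None) -> str:
    """Pick which simulated data set a test should load."""
    # Regular CI and local runs keep the small files; nightly uses the large ones.
    return "large" if is_nightly(environ) else "small"
```

In a pytest suite the same predicate could also drive `pytest.mark.skipif(not is_nightly(), reason="nightly only")` for the first option in the comment, where the test does not run at all outside the nightly job.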

nvaytet added 2 commits January 30, 2026 09:32

save better quality data to cif file for integration tests

fbf772b

use more bins

da430f8

nvaytet requested a review from jl-wynen January 30, 2026 08:43

nvaytet changed the title ~~Produced better reduced data in nightly runs~~ Produce better reduced data in nightly runs Jan 30, 2026

pre-commit-ci-lite bot and others added 2 commits January 30, 2026 08:43

Apply automatic formatting

1b169d1

Merge branch 'main' into nightly-reduced-data-fix

09041c8

SimonHeybrock reviewed Jan 30, 2026

jl-wynen reviewed Jan 30, 2026
