Skip to content

Propose fix for nc/idl map#284

Open
jkuhl-uni wants to merge 5 commits into
fjosw:developfrom
jkuhl-uni:fix/oqcd_fed_idx
Open

Propose fix for nc/idl map#284
jkuhl-uni wants to merge 5 commits into
fjosw:developfrom
jkuhl-uni:fix/oqcd_fed_idx

Conversation

@jkuhl-uni
Copy link
Copy Markdown
Collaborator

This is the fix I would propose for #283.
The interplay of this change and the lines 379, 380 is a bit strange to me.
In the proposed change, those lines would only play a role, if dtr_read < diffmeas and we need diffmeas to find the configuration numbers.
This would mean that, if I were to use dtr_read = n * diffmeas, I would end up with 1/n of the statistics and, obviously, a wrong idl, in which the indices are 1/n of their actual value).

@jkuhl-uni jkuhl-uni requested a review from fjosw as a code owner May 11, 2026 16:01
@fjosw fjosw requested a review from s-kuberski May 12, 2026 08:12
@s-kuberski
Copy link
Copy Markdown
Collaborator

Hi!
Thanks for reporting the issue and for proposing a fix. I will look into both (also trying to remember if I have observed something like this in the past).

@s-kuberski
Copy link
Copy Markdown
Collaborator

Thanks for looking into this!

I agree that the old implementation contained a bug and think the code change is right. There might be another issue in the documentation of dtr_read. In the current implementation, dtr_read is used as a trajectory spacing: a flow record is kept if its trajectory number nc satisfies nc % dtr_read == 0. Thus, for openQCD input with dtr_ms = 2 and dtr_cnfg = 4, one has to call the reader with dtr_read = 4, not with dtr_cnfg / dtr_ms = 2. Would you agree?

I was also worrying if the computation of the thermalization offset could create problems in the case where the number of thermalization trajectories is no multiple of dtr_read, but I think this is excluded because openQCD checks ie|=((nth%dtr_cnfg)!=0);

@jkuhl-uni
Copy link
Copy Markdown
Collaborator Author

jkuhl-uni commented May 20, 2026

Thank you for having a look at this.
I agree the documentation is not correct in this case and should be updated.

Indeed, the case you describe would be problematic, if it was possible.
It should never occur with openQCD, but then again, since // is used, the user would never know. Maybe we can find a way to implement a failsafe? Like throwing an error for the case
if not np.isclose(configlist[-1][0]/diffmeas, int(configlist[-1][0]/diffmeas),1e-12)
or so.

@s-kuberski
Copy link
Copy Markdown
Collaborator

Hi,

I agree that it is a good idea to explicitly check. One could also use the slightly more simple check configlist[-1][0] % diffmeas == 0 or even all(c % diffmeas == 0 for c in configlist[-1]) to ensure that the file is not corrupted.

In terms of the docstring, I think your fix is not quite correct. If we have a setup where

dtr_ms   = 2
dtr_cnfg = 4

in the openQCD input file, we would have to set dtr_read = 4 in this function, right? So effectively, dtr_read has to be set to dtr_cnfg if one wants to extract the flow measurements for each configuration.

@jkuhl-uni
Copy link
Copy Markdown
Collaborator Author

Yes, you are right in both points.
Towards your second point, this is true, I was sceptic at first as you could in principle only measure on every nth config, but the user gets feedback on that in line 410.
I was also wondering if line 406 would be obsolete with your test, but this actually checks the same over multiple replica (the question being, if this is too much of a constraint, such that we should only throw a warning for this?).

@s-kuberski
Copy link
Copy Markdown
Collaborator

In the (unlikely, but okay) case that one chose

dtr_ms   = 8
dtr_cnfg = 4

then dtr_read = 4 or dtr_read=8 would lead to a wrong identification of the configuration numbers (but a warning would be thrown). I fear that we cannot circumvent this behavior without adding the additional information on dtr_cnfg, because there is no other way to identify the mapping trajectory number <-> configuration number. One could add this as explicit argument that would replace diffmeas.

On the second point:
I think the existing check is was not super stringend because of the integer division, right? In case a single trajectory is missing, we would not have thrown an error.

@jkuhl-uni
Copy link
Copy Markdown
Collaborator Author

  1. Yes, that is true. I think the current implementation is enough for most people and in other cases, the user should be aware of their choice, I think.
  2. This is not quite what I meant. In line 406, np.any([len(np.unique(np.diff(cl))) != 1 for cl in configlist]), checks the complete configlist, which is a 2D array: configlist[#rep][#cnfg] such that also differences in the step size between different replica are checked. My question was just, whether this is the intended behaviour.

@s-kuberski
Copy link
Copy Markdown
Collaborator

  1. Maybe we can add something to the docstring that points out that the routine expects that a measurement has been performed for each configuration (and possibly more measurements in between but no configs without measurements)? We could add it to the description of dtr_read.
  2. That is a good question. I would say that it would be highly unlikely that the measurement frequency differs between two replica. Also our parameter r_step does not allow to choose different values for two replica. In special cases, users could still read the flow observables for single replica and combine them afterwards.

@jkuhl-uni
Copy link
Copy Markdown
Collaborator Author

jkuhl-uni commented May 21, 2026

Good. Agreed.
In fact, this is already incorporated in the doc-string, but I'll also add a word on the second point to the description of r_step to point out the second part.

@jkuhl-uni jkuhl-uni linked an issue May 21, 2026 that may be closed by this pull request
@s-kuberski
Copy link
Copy Markdown
Collaborator

Great! I think this can be merged. With the flow file that is currently present in the test directory, I could not come up with a test that could check the proper behavior, as the only variation that we can do is to use either dtr_read = 1 or dtr_read = 3 which trivially give the same result.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

dtr_read idx vs sample mismatch

2 participants