Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instrument settings - incident_angle allowable formats #104

Open
jfkcooper opened this issue Apr 26, 2023 · 28 comments
Open

Instrument settings - incident_angle allowable formats #104

jfkcooper opened this issue Apr 26, 2023 · 28 comments
Labels
enhancement New feature or request

Comments

@jfkcooper
Copy link
Contributor

I am just writing a .ort file saving algorithm for Mantis and realised that the current standard doesn't allow for (or isn't worded so it is clear it allows) a list of angles which were measured at.
At Isis it is common to measure at several angles and then stitch the datasets together (the correctness of doing this is a discussion for another time). The resulting stitched dataset will therefore be the product of several different runs taken at different angles. I wish to include these angles in the reduced data file, but as far as I can see, the allowable values are Union[value, vale range].
I propose to allow this to be Union[value, list[values], value range] instead so that a dataset which was measured at e.g. 0.3, 0.7, and 2.0 degrees can include these explicitly.
I would also say Wavelength could be similarly modified such that the wavelength ranges used at each of the angles could be included, making it Union[value, list[values], value range, list[value range]].
Any thoughts on this are appreciated

@jfkcooper jfkcooper added the enhancement New feature or request label Apr 26, 2023
@arm61
Copy link
Contributor

arm61 commented Apr 26, 2023

This seems like something that would have been thought about by @jochenstahn, @aglavic, or @bmaranville?

@jochenstahn
Copy link
Contributor

This is a topic we discussed before - but there was no decision made.....
Please have a look at
https://www.reflectometry.org/file_format/specs_discussion
at the section 'stitched data'. Please comment on the suggestions there and probably add your own.

@andyfaff
Copy link
Contributor

Do we allow for having multiple incident_angle entries?

#    measurement: 
#         instrument_settings:  
#             incident_angle:  
#                 magnitude:  0.65
#                 unit:       deg    
#                 error:      0.01
#             incident_angle:  
#                 magnitude:  3.0
#                 unit:       deg    
#                 error:      0.01

@jfkcooper
Copy link
Contributor Author

How would/could this be input?

@jochenstahn
Copy link
Contributor

I am not sure about multiple entries of the kind andyfaff suggested. Maybe Artur and Brian can comment on this?

The second suggestion in the specs_discussion page, i.e.

#        data_files:
#           - file: ....
#             timestamp: ....
#             details: 1.0 deg, 42 min

has the advantage that the angle (and other information) is assigned to an input file. One could even go to

#        data_files:
#           - file: ....
#             timestamp: ...
#             incident_angle: ...
#             wavelength: ...

This can always be done since (to my understanding) the key words incident_angle and so on are not looked up here.

Am I right that this additional information won't be used in the further data processing, but it rather helps a potential reader to understand what has been done?

@bmaranville
Copy link
Contributor

I believe there's a way to enter multiple datasets with different header information for each - this is how polarized composite datafiles work, for instance. You could put a different value of datasource.measurement.instrument_settings.incident_angle for each dataset, if you wanted to keep them separate after reduction, or you could make a column that has a value of the incident angle for each Q and R reported (we still have to all agree on how that will be uniquely identified as a column, see #70)

@jfkcooper
Copy link
Contributor Author

jfkcooper commented Apr 26, 2023

My approach is currently somewhat pragmatic, as I don't really want the saving into an .ort format to require a rework of the reduction schema, so what I have is a single workspace entity which has the stitched data, and most of the information about what went into it. My thoughts are that something like:

#    measurement: 
#         instrument_settings:  
#             incident_angle:  
#                 individual_values: 0.3, 0.7, 2.0
#                 unit:       deg 

As was previously mentioned would be a potential unambiguous solution

*edited after finding the relevant part of previous discussion - sorry

@bmaranville
Copy link
Contributor

We don't have that sort of entry in our schema, right now - how would you figure out which Q points correspond to which incident_angle values, if you have a long list of Q?

@jfkcooper
Copy link
Contributor Author

With stitched data you can't I'm afraid, though by also changing the Wavelength to be a list of ranges then it would imply the wavelengths/angles

@jfkcooper
Copy link
Contributor Author

As I mentioned, the correctness of stitching data is another matter, but at least at ISIS the angular resolution matching and extremely high wavelength resolution means that it isn't a big practical issue to stitch.

@bmaranville
Copy link
Contributor

How would you use the wavelength list + angle list to get the wavelength and angle for a given Q? I'd like to hear about how you work with stitched data (no criticism implied by all these questions - just curiosity)

@jfkcooper
Copy link
Contributor Author

If we had:

#    measurement: 
#         instrument_settings:  
#             incident_angle:  
#                 individual_values: 0.3, 0.7, 2.0
#                 unit:       deg 
#         wavelength:
#                    individual_values: [2.0,14.0], [2.0,14.0], [1.5,14.0]

This makes the accessible Q points for the three angles
[0.0047,0.033], [0.011,0.077], [0.031, 0.29]
which do overlap, but give the majority contributor at a given Q. Not ideal tbh, but I thought including the information would be better than not.

@andyfaff
Copy link
Contributor

If you're limited to two or three angles stitching isn't a problem.
It would be good to store those two or three angles in the header. It s not really any different in character to storing an angular range. I typically go back to old files to figure out how we measured the data, when we're setting up a new experiment.

We typically work with the equivalent of a QProbe and you don't need to know what the angles or wavelengths are, just q and dq are sufficient.
The only time you need wavelength and angle is if you have some sort of absorption to take care of. If that's necessary you can just use extra columns.

@andyfaff
Copy link
Contributor

@jochenstahn said:

Am I right that this additional information won't be used in the further data processing, but it rather helps a potential reader to understand what has been done?

This is mostly correct, in that we mostly use it to go back and figure out what angles we used.

However, I still think they need to be machine readable/retrievable.

It's not necessarily suitable to store them as a column.

  • the nominal angle of incidence is not necessarily the same as the actual angle of incidence for a given Q-point, due to gravity. So now you have to have two columns, to store both.

  • a given Q point may not be associated with a given angle of incidence. If a detector has high resolution and wavelength binning is fine, then individual detector pixels may be rebinned in Q. For a given detector image each individual detector pixel may have the same nominal angle of incidence, but they all have unique actual angles of incidence. This uniqueness arises from gravity effects (different wavelengths have different angles of incidence), but also from the divergence of the beam - at a given wavelength one side of the specular ridge will have a lower angle of incidence than the other side of the specular ridge. The rebinning process lumps together pixels of similar Q, but I don't think it's possible to extract an actual angle of incidence associated with each bin. In such a case you probably want to store the three angles in the header.

My thought is we want the ability to do both. In the header we'd like to store individual incident_angle (probably 2/3), but also retain the ability to have various angle columns.

@jochenstahn
Copy link
Contributor

  1. We are talking about reduced data here. Thus I assume that gravity and attenuation correction is already realised. Especially for data based on various input data sets. What is the point of applying these corrections afterwards with mixed conditions? These corrections should be mentioned in the reduction section.
  2. How the corrections and the merging was done can be mentioned in the reduction section as well. Probably with a reference to the applied procedure.
  3. For simple data sets, where there is no conflict, one can use separate columns for incident_angle, attenuation and so on. If present, this column should have higher priority than an entry in the header (under discussion).
  4. For half-reduced data one can merge various measurements into one file using multiple data_set entries. This allows to provide the correction input for each data point. I assume that the reduction software does not have a problem with several overlapping q ranges and varying binning.
  5. Still it is useful to provide information about the individual measurements in the merged and reduced data file. This will mainly help to retrace the measurement and probably to repeat it. The related quantities might be incident angle, counting time and attenuation factor. Since these are associated with one input data set I recommend to add this information to the list of data_files. And if we use the official nomenclature from the instrument settings section, these are easily machine readable (if necessary).

@andyfaff
Copy link
Contributor

We are talking about reduced data here. Thus I assume that gravity and attenuation correction is already realised. Especially for data based on various input data sets.

There's no gravity corrections applied afterwards, but the fact that there are gravity corrections and Q-rebinning can mean that it's not possible to create a column of incidence_angle, so it needs to be in the header.

  1. See 1.
  2. One could use a column for simple datasets, but it's kind of overkill if it's possible to write two/three numbers in a header.
  3. The data from the ANSTO instruments is typically constant dq/q so stitching is straightforward.
  4. Agree

@jochenstahn
Copy link
Contributor

Why is all this so complicated? We already allow entries in the header OR as an extra column, depending on the complexity of the data set. And if it is not possible to create an incident_angle column, it can not be created. No need to discuss this.

The .ort format already allows for all this. The only thing missing as far as I can see is a clear recommendation where to put extra information about the measurements which is else lost in the reduced data file. We have several suggestions already:

  1. In the data-files section as a comment:
#        data_files:
#           - file: ....
#             timestamp: ....
#             details: 1.0 deg, 42 min
  1. or in a machine readable way:
#        data_files:
#           - file: ....
#             timestamp: ...
#             incident_angle: ...  
#             wavelength: ...

(pro: - relation to an input data set, contra: - one more place where to look for this information, - difficult to automatise in the reduction software)

  1. In the instrument_settings section:
#         instrument_settings:  
#             incident_angle:  
#                 individual_values: 0.3, 0.7, 2.0
#                 unit:       deg 
  1. or with several entries:
#         instrument_settings:  
#             incident_angle:  
#                 magnitude:  0.65
#                 unit:       deg    
#                 error:      0.01
#             incident_angle:  
#                 magnitude:  3.0
#                 unit:       deg    
#                 error:      0.01

(pro: - information is where it is expected, contra: - no connection to a specific input data set).

In case I misunderstood the problem, could someone please try to explain it to me (again)?

1 similar comment
@jochenstahn
Copy link
Contributor

Why is all this so complicated? We already allow entries in the header OR as an extra column, depending on the complexity of the data set. And if it is not possible to create an incident_angle column, it can not be created. No need to discuss this.

The .ort format already allows for all this. The only thing missing as far as I can see is a clear recommendation where to put extra information about the measurements which is else lost in the reduced data file. We have several suggestions already:

  1. In the data-files section as a comment:
#        data_files:
#           - file: ....
#             timestamp: ....
#             details: 1.0 deg, 42 min
  1. or in a machine readable way:
#        data_files:
#           - file: ....
#             timestamp: ...
#             incident_angle: ...  
#             wavelength: ...

(pro: - relation to an input data set, contra: - one more place where to look for this information, - difficult to automatise in the reduction software)

  1. In the instrument_settings section:
#         instrument_settings:  
#             incident_angle:  
#                 individual_values: 0.3, 0.7, 2.0
#                 unit:       deg 
  1. or with several entries:
#         instrument_settings:  
#             incident_angle:  
#                 magnitude:  0.65
#                 unit:       deg    
#                 error:      0.01
#             incident_angle:  
#                 magnitude:  3.0
#                 unit:       deg    
#                 error:      0.01

(pro: - information is where it is expected, contra: - no connection to a specific input data set).

In case I misunderstood the problem, could someone please try to explain it to me (again)?

@jfkcooper
Copy link
Contributor Author

Having multiple entries I think becomes a little tricky from a practical perspective (maybe)

header.data_source.InstrumentSettings.incident_angle = self.angle

can't be called repeatedly to do this, and maybe we could change how we implement the writing in orsopy in order to add them from a list to be multiple entries in the header, though I think that saving 4 lines of comment by using commas would be my preference (i.e. option 3)

@arm61
Copy link
Contributor

arm61 commented Apr 27, 2023

Is the solution creating a ValueList base object in companion to Value and ValueRange?

@jochenstahn
Copy link
Contributor

A ValueList sounds good. Should we use the suggested key word individual_values?

@arm61
Copy link
Contributor

arm61 commented Apr 29, 2023

Just values?

@arm61
Copy link
Contributor

arm61 commented May 1, 2023

Or magnitudes to match Value

@jochenstahn
Copy link
Contributor

Yesterday I discussed this with Artur. He recommended not to create a new ValueList (more complexity) but to extend the ValueRange by the list individual_magnitudes (or an other name, not to be confused with value or magnitude). This is reasonable because a list of values implies a range.
If no one objects, we will realise this in orsopy.

@andyfaff
Copy link
Contributor

andyfaff commented May 3, 2023

THanks for the discussion you two had. WOuld it be possible to give an example?

@arm61
Copy link
Contributor

arm61 commented May 3, 2023

I don’t love using ValueRange for list, a range can be continuous, a list cannot. Additionally, by adding the individual_magnitudes to ValueRange the complexity of that class is increased, so there is no real simplicity gain.

@jochenstahn
Copy link
Contributor

@arm61 : Within ValueRange we have the keys min and max - both single values - and unit. So we do not use the continuous aspect anywhere yet. About the simplicity: As far as I understood, orsopy figures out which class is used by looking for certain key words. For ValueRange that would be min and max. The (small) increase in complexity is cuased by also looking for individual_magnitudes if this belongs to a separate class.

@andyfaff : In the end it would look like discussed above (plus brackets):

#         instrument_settings:  
#             incident_angle:  
#                 min: 0.3
#                 max: 2.0
#                 individual_magnitudes: [0.3, 0.7, 2.0]
#                 unit:       deg 

@arm61
Copy link
Contributor

arm61 commented May 9, 2023

@jochenstahn, I think I am more worried about the complexity from the orsopy user side. With including the individual_magnitudes into the ValueRange object, the documentation become less clear, i.e.,

@orsodataclass
class ValueRange(Header):
    """
    A range or list with mins, maxs, and an optional unit.
    """

    min: float
    max: float
    individual_magnitudes: Optional[List[float]] = field(default=None, metadata={"description": "If ValueRange is being used to describe discrete values, they can be defined here"})
    unit: Optional[str] = field(default=None, metadata={"description": "SI unit string"})

Feels like a nuanced point to make, where we could instead just add a ValueList class.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants