I discovered your forum today. Thank you very much for all the work you do, it is awesome!
I would like to know if anyone knows how to decipher the XML format output when surveying the volumes of a plate using an Echo 525 machine. Here’s the type of data we get:
<platesurvey name="384PP_AQ_BP" barcode="UnknownBarCode" date="2023-10-25 12:32:16.000" vtl="0" original="1" frmt="1" rows="16" cols="24" totalWells="384">
<w r="0" c="0" n="A1" vl="41.024" cvl="41.024" status="" fld="AQ" fldu="" x="56005.004" y="71691" s="100" fsh="100" fsinh="100" t="3.163" ct="3.163" b="0.535" fth="3.163" ftinh="3.163" o="0" a="NONE">
<e t="AVG" x="56005.004" y="71691" z="17717.291">
<f t="FW BB" o="31.08" v="1.812"/>
<f t="FW TB1" o="31.616" v="1.562"/>
<f t="SR" o="35.876" v="3.984"/>
<w r="0" c="1" n="A2" vl="39.744" cvl="39.744" status="" fld="AQ" fldu="" x="60486.897" y="71691" s="100" fsh="100" fsinh="100" t="3.071" ct="3.071" b="0.532" fth="3.071" ftinh="3.071" o="0" a="NONE">
<e t="AVG" x="60486.897" y="71691" z="17717.291">
<f t="FW BB" o="31.062" v="1.796"/>
<f t="FW TB1" o="31.594" v="1.625"/>
<f t="SR" o="35.73" v="3.984"/>
Some attributes of the w nodes are quite straightforward, such as n (the well name), but the rest look pretty obfuscated to us. We guess that vl/cvl are the volumes that were measured by the Echo.
Would you have more information about this format or a formal specification of it?
Thank you very much in advance!
Hello @cosmo . I recognize that one of your motivations in posting this may have been to keep me from needing to spend too much time on it. But it’s an interesting question! And for this, we may as well discuss it here, where others with more knowledge might jump in, or what we fumble around with might be useful.
EW BB and EW TB only show up for empty wells on my surveys. I’m ignoring those elements for now, but I suspect they’re largely similar to FW BB and FW TB1.
I’ve only looked at the CSV vs XML output so far. I’m using the following very rough code to parse the XML, rather than writing something custom:
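import pandas as pd
import matplotlib.pyplot as plt  # used for the scatter plot further down

filename = "survey.xml"  # placeholder: path to the survey XML file
dat = pd.read_xml(filename)  # one row per w (well) element
# One row per f element of each type; these seem to appear only for non-empty wells.
fwbbdat = pd.read_xml(filename, xpath="//f[@t='FW BB']")
fwtb1dat = pd.read_xml(filename, xpath="//f[@t='FW TB1']")
srdat = pd.read_xml(filename, xpath="//f[@t='SR']")
fdat = dat[dat['vl'] > 0].reset_index()  # drop empty wells so rows line up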
fdat and the others seem to safely have the same row indices this way.
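As an alternative that does not rely on separate frames sharing row order, here is a minimal sketch using only the standard library that attaches each f feature to its parent well. It assumes the real file nests e inside w and f inside e, with closing tags that the excerpt in the first post does not show. SAMPLE and parse_survey are illustrative names, not part of any Echo software.

```python
import xml.etree.ElementTree as ET

# Trimmed sample built from the excerpt in the first post. The closing tags
# are an assumption: the excerpt does not show them.
SAMPLE = """<platesurvey name="384PP_AQ_BP" rows="16" cols="24" totalWells="384">
<w r="0" c="0" n="A1" vl="41.024" b="0.535" ftinh="3.163" x="56005.004" y="71691">
<e t="AVG" x="56005.004" y="71691" z="17717.291">
<f t="FW BB" o="31.08" v="1.812"/>
<f t="FW TB1" o="31.616" v="1.562"/>
<f t="SR" o="35.876" v="3.984"/>
</e>
</w>
<w r="0" c="1" n="A2" vl="39.744" b="0.532" ftinh="3.071" x="60486.897" y="71691">
<e t="AVG" x="60486.897" y="71691" z="17717.291">
<f t="FW BB" o="31.062" v="1.796"/>
<f t="FW TB1" o="31.594" v="1.625"/>
<f t="SR" o="35.73" v="3.984"/>
</e>
</w>
</platesurvey>"""


def parse_survey(xml_text):
    """Return {well name: {"attrs": {...}, "features": {type: (o, v)}}}."""
    root = ET.fromstring(xml_text)
    wells = {}
    for w in root.iter("w"):
        features = {}
        for e in w.iter("e"):
            # Only features under the AVG echo are collected here.
            if e.get("t") != "AVG":
                continue
            for f in e.iter("f"):
                features[f.get("t")] = (float(f.get("o")), float(f.get("v")))
        # Attributes are kept as strings; convert them as needed.
        wells[w.get("n")] = {"attrs": dict(w.attrib), "features": features}
    return wells


wells = parse_survey(SAMPLE)
print(wells["A1"]["features"]["FW BB"])
```

Keying by well name means empty wells simply end up with no FW features, so no filtering on vl is needed to keep things aligned.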
Some of the values are reasonably easy to discern:
- Recall that it’s the transducer/nozzle that’s moving here. x, y can be interpreted as that position. The position seems to be roughly in µm, but the values aren’t exactly [0, 4_500, -4_500] as I’d expect, so either some calibration is involved or there is something slightly unusual. In our case dat.loc[:, 'x'].diff().round(0).unique() gives [4482., -103077., -103090.], and the two latter values divided by 23 give roughly 4482 in magnitude. This is consistent with 4.5 mm distances between wells. y has similarly expected values. This is fun, for example: plt.scatter(dat['x'], dat['y'], c=dat['vl']).
- vl == cvl for all of my (obviously AQ & no surfactant) samples. I assume this has something to do with fluid type.
- b is described as “Bottom Thickness (us)” in the CSV. It is also equal to fwtb1dat['o'] - fwbbdat['o'], up to different rounding, so I would assume that the FW BB o is the time of flight to the bottom of the plate, and the FW TB1 o is the time of flight to the beginning of the well (end of the plastic).
- BottomEchoAmpRatio in the CSV equals fwtb1dat['v'] / fwbbdat['v'], so I expect those values are related to the amplitude of the echo from the bottom/top of the plastic.
- ftinh appears to be around a constant multiple of (SR o - FW TB1 o), possibly up to rounding. fth is usually equal to ftinh, but oddly jumps to around 2.5× in some wells. ftinh makes sense as a ToF value in the fluid, in part because of the next point.
- vl appears to be close to linearly dependent on ftinh, but not exactly; it seems to differ more for low-volume wells. I’m missing something here.
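The relationships above can be checked against the two wells from the excerpt in the first post. This is only a sketch: the values are copied by hand, two wells cannot confirm anything in general, and fluid_ratio is my own label for the apparent constant, not an Echo term.

```python
# Values copied from the A1 and A2 wells in the first post's excerpt.
# Each feature is (o, v): time of flight in µs and peak-to-peak value.
wells = {
    "A1": {"x": 56005.004, "b": 0.535, "ftinh": 3.163,
           "FW BB": (31.08, 1.812), "FW TB1": (31.616, 1.562),
           "SR": (35.876, 3.984)},
    "A2": {"x": 60486.897, "b": 0.532, "ftinh": 3.071,
           "FW BB": (31.062, 1.796), "FW TB1": (31.594, 1.625),
           "SR": (35.73, 3.984)},
}

for name, w in wells.items():
    bottom = w["FW TB1"][0] - w["FW BB"][0]     # compare with b
    amp_ratio = w["FW TB1"][1] / w["FW BB"][1]  # compare with BottomEchoAmpRatio in the CSV
    fluid_tof = w["SR"][0] - w["FW TB1"][0]     # apparent time of flight in the fluid
    fluid_ratio = w["ftinh"] / fluid_tof        # hypothetical name for the constant multiple
    print(f"{name}: bottom={bottom:.3f} b={w['b']} "
          f"amp_ratio={amp_ratio:.3f} fluid_ratio={fluid_ratio:.4f}")

# Spacing between adjacent columns, to compare with the nominal 4500 µm pitch.
pitch = wells["A2"]["x"] - wells["A1"]["x"]
print(f"pitch={pitch:.1f}")
```

For these two wells the ftinh ratio comes out close to 0.7425 in both cases, which fits the "constant multiple" observation, though the constant may well depend on fluid or plate type.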
These appear to be the tags and attributes, at least those I’ve seen so far:
- (root element) platesurvey: the plate survey
  - name: plate type
  - barcode: barcode. If not found, “UnknownBarCode”.
  - date: mostly ISO 8601; space instead of T, goes to ms with dot, no TZ specified.
  - serial_number: machine serial number
  - vtl: unknown
  - original: unknown
  - frmt: data format. “1”
  - rows, cols: number of rows and columns in plate
  - totalWells: total number of wells in plate
  - (subelement) w: Well
    - r, c: well row and column. 0-indexed.
    - n: well name, standard format, no padding/leading 0.
    - vl: volume
    - cvl: current volume. The distinction here is likely related to surveys done during transfers.
    - status: empty string for success, otherwise info
    - fld: fluid
    - fldu: fluid units
    - x, y: meniscus center coordinates
    - s: unclear
    - fsh: homogeneous DMSO?
    - fsinh: inhomogeneous DMSO?
    - t: fluid thickness? µs
    - ct: current fluid thickness? µs
    - b: bottom thickness of well. µs
    - fth: homogeneous fluid thickness? µs
    - ftinh: inhomogeneous fluid thickness? µs
    - o: outlier?
    - a: corrective action?
    - (subelement) e: Echo signal
      - t: type. Either AVG or RAW.
      - x, y, z: transducer coordinates
      - (subelement) f: individual Features
        - t: type. String of:
          - FW BB: bottom of plate plastic at well?
          - FW TB1: top of well plastic at well / bottom of well?
          - SR: meniscus?
          - other values?
        - o: time of flight. µs
        - v: peak-to-peak value
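To illustrate how these attributes might be consumed, here is a rough sketch that reads the header and lays the vl values out on a rows × cols grid using the 0-indexed r and c. The SAMPLE string is a trimmed stand-in with closing tags assumed, leaving unsurveyed wells as None is my own choice, and volume_grid is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Trimmed stand-in for a survey file; the w elements are written as
# self-closing here only to keep the sample short (an assumption).
SAMPLE = """<platesurvey name="384PP_AQ_BP" barcode="UnknownBarCode" date="2023-10-25 12:32:16.000" rows="16" cols="24" totalWells="384">
<w r="0" c="0" n="A1" vl="41.024" cvl="41.024" status=""/>
<w r="0" c="1" n="A2" vl="39.744" cvl="39.744" status=""/>
</platesurvey>"""


def volume_grid(xml_text):
    """Return (survey datetime, rows x cols nested list of vl values)."""
    root = ET.fromstring(xml_text)
    # Space-separated date with milliseconds and no time zone.
    when = datetime.strptime(root.get("date"), "%Y-%m-%d %H:%M:%S.%f")
    rows, cols = int(root.get("rows")), int(root.get("cols"))
    # Wells missing from the file stay None.
    grid = [[None] * cols for _ in range(rows)]
    for w in root.iter("w"):
        grid[int(w.get("r"))][int(w.get("c"))] = float(w.get("vl"))
    return when, grid


when, grid = volume_grid(SAMPLE)
print(when.isoformat(), grid[0][:2])
```

The nested list converts directly to an array with numpy.array(grid, dtype=float) if numpy is available, in which case the None entries become NaN.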
A significant warning here: it turns out that’s the format for surveys by “Echo Liquid Handler”/Medman. Cherry Pick, for example, uses a completely different format. I expect Plate Reformat does too. So the intra-protocol surveys those take will need to be handled differently.
The CP/presumably-PR formats are at least much more self-documenting, even if they seem bizarrely redundant (each tag also has a name attribute that is essentially the tag name with spaces).
I assume the format above is the format being sent by the SOAP interface, and CP/PR are doing something with that.
I have a rudimentary Python library to parse both formats now, and some other Echo-related files. The name might change; it is also very rough at the moment. It can, however, at least pull volumes out of both formats and provide them in convenient forms (numpy and polars).
I’m putting information on the raw formats in the documentation as well.