PROCBLK new record + SDFMODE extension


Etienne JULLIEN


Greetings,

Here is a formal proposal for the new PROCBLK record we discussed during the last VEE, as well as a proposal to extend SDFMODE to more protocols.

Basically, PROCBLK works the same as PROC, but enables declaring whole blocks of device-specific data. Thus, dataset records such as TRCFMT could be included in such blocks.

The SDFMODE modification is simply an extension of the syntax to allow more protocols (without it, we would have to modify the standard for each new protocol; https, for example, is not allowed in the current state).

See you soon!

Regards,

Etienne.

 

 

 

VEE 2018 - Proposal of DCS records evolution - Essilor 20180201.pdf


A few comments:

PROCBLK/PROCEND:

I realize that the intent is for this to be used only when communicating with the LMS, but it's important to be aware that this type of data often does find its way to devices as well, usually because some labs run limited home-brew LMS systems, or light systems that do very little logical processing, or because some labs don't even have a working VC Host server and instead just feed devices with text files containing DCS-style data.

So it's entirely realistic to expect that some devices will receive... everything, and will need to sort it all out themselves, which in this case means parsing these data blocks.

For devices already aware, this is of course easy. Just ignore everything PROCBLK indicates is for a different device, and prefer/override with a matching block. The problem is what can happen with older devices that don't support it yet. Such devices will ignore the unknown labels (PROCBLK/PROCEND) but will try to process familiar labels inside the blocks.

I think that the best way to avoid these problems is to have all data inside the blocks passed using another label. That would be a trivial change for anything new that is built to handle these blocks, but it would make older/non-supporting devices ignore the data.

e.g. instead of (to copy an example from the document):

PROCBLK=FSG;;HSC
LMATID=70;70
_SCSBRD=2.50;2.50
_SCSBCCA=0;180
GAX=0;0
PROCEND

maybe use something like:

PROCBLK=FSG;;HSC 
P=LMATID=70;70 
P=_SCSBRD=2.50;2.50 
P=_SCSBCCA=0;180 
P=GAX=0;0 
PROCEND 

This would probably require that label to be exempt from the 80-character limit (since some of the wrapped records may already be close to it), but I expect that this limit isn't really an issue for any system that still keeps up to date with new DCS functionality anyway (or is that wrong?), and there are already some exempt labels in the standard.
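
Just to make that concrete, here is a minimal sketch (Python, with hypothetical names; the "P" wrapper label is only the suggestion above, nothing from the standard) of how a block-aware parser could unwrap such lines, while a non-aware device would simply skip the unknown "P" label:

# Sketch only: assumes records arrive as plain "LABEL=value" text lines.
def unwrap_block_lines(lines):
    """Yield (label, value) pairs, unwrapping the suggested "P=" lines inside blocks."""
    in_block = False
    for line in lines:
        label, _, value = line.partition("=")
        if label == "PROCBLK":
            in_block = True          # block header; device targeting not handled here
        elif label == "PROCEND":
            in_block = False
        elif in_block and label == "P":
            # "P=LMATID=70;70" -> inner label "LMATID", inner value "70;70"
            inner_label, _, inner_value = value.partition("=")
            yield inner_label, inner_value
        elif not in_block:
            yield label, value
        # A non-aware device just sees an unknown label "P" and ignores the whole line.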

SDFMODE:

I understand the benefit of the LMS value when sending data from the LDS to the LMS, so the LMS will know not to transfer LDPATH as-is to devices but to inline it (though, as an aside, does that really happen anywhere? I don't think I've ever encountered an LMS that sent SURFMT and ZZ/etc. labels to our devices in the main data packet instead of providing a file listed in LDPATH).

But I don't really see the benefit of any of the other potential values, given that they will all need to be combined with the identifier in LDPATH anyway, and that the LDPATH identifier cannot be interpreted correctly without knowing it.

That is, what is the practical benefit of something like

SDFMODE=ftp://server:port/dir/subdir/
LDPATH=file.sdf

over just

LDPATH=ftp://server:port/dir/subdir/file.sdf

regardless of SDFMODE? Why is a file retrieved from an http or ftp URI different from a file retrieved from a local file system or an SMB URI? I completely agree that specifying the access method as a fixed set of values in SDFMODE is not a good idea, and would require needless modifications for each new access protocol. But I don't see why specifying anything beyond the current send-location/send-inline differentiation is needed when LDPATH is already capable of holding (and practically needs to hold) the entire protocol and location.

(e.g. local file access is done with a different protocol than network access over SMB/samba shared folders, but I really don't think anyone here would suggest that LDPATH can contain "c:\folder\file.sdf" locally, but that over the network LDPATH should just contain "file.sdf" and SDFMODE should contain "\\server\share\folder\", right? It makes sense to handle this as it is handled currently: SDFMODE assumed to be FILE and LDPATH "\\server\share\folder\file.sdf", even though a different access protocol is used behind the scenes. It shouldn't be different when the URI scheme starts with "ftp://" rather than "\\"; semantically they're about the same in this context)

If anything, the first case (splitting the protocol/location) is potentially worse, because if it happens to reach a device that doesn't support the mode (e.g. FTP ) and doesn't check SDFMODE (because it's been mostly useless for most devices so far), there is a chance that instead of throwing an error it will process the wrong data (like reading a file from a default shared network folder, for a lab that re-uses job numbers).

And there isn't really a benefit to the first case, since in any case where the first can work, the second can as well, and in any case where the first can't work, neither can the second.

There isn't any suggestion to add mode/access-protocol negotiation (and I think it would be a mistake to add one), so in any case the receiver of the data (regardless of whether it's a device or an LMS) will need to handle whatever it receives. So why split the single data point (the location of the surface data) into two parts?

 

Also, SDFMODE really doesn't play nicely with multiple LDPATH records. And while it's true that ideally PROCBLK blocks could replace those entirely (so each PROCBLK section can have its own SDFMODE and LDPATH), these two things (vendor/device data blocks, and support for more surface file acquisition modes) can theoretically be kept separate, and putting the focus back on SDFMODE just entangles them (you can't support multiple different SDFMODE values without also supporting PROCBLK, but you absolutely can support multiple LDPATH records, where some are local files and some are http or ftp or whatever, without supporting PROCBLK).


Many thanks for the extensive and relevant comments, Yaron.

Regarding PROCBLK, as you pointed out, this is an LDS=>LMS record, exactly like the PROC one. It is never supposed to make it to the actual device. I understand your point, though, but I'm pretty sure the issue you are raising, with older systems letting those labels flow up to the device, should be handled during the deployment and configuration phase. Basically, the LDS should be configured not to use such records in that specific case.

The idea you suggest (inserting a fake label in each line within a PROCBLK block) could work as a safeguard too, but I'm not sure it would be as easy for an LMS to manage as the current proposal. We can discuss this with the LMS teams, of course.

Still, the absolute target is that devices should never get such records.

Regarding SDFMODE, the point is that depending on the mode used, the identifier sent in the LDPATH tag might not always be a filename. It also might not be inserted directly into the SDFMODE record content, or not always at the end (for http requests, for example). Basically, the idea is to have SDFMODE contain the static data, and the variable part in LDPATH. With some protocols, such as ftp or http, the device might also have to add credential information (login/password, etc.) that we obviously don't want to specify in the LDS data. So the SDFMODE value will be completed / altered anyway in those cases, which is why it does not sound too bad to have LDPATH contain the variable part (file identifier / name).

Basically, the final URL / path used to access the surface would be composed of (roughly as in the sketch below):

  • The SDFMODE static part
  • The LDPATH variable part
  • Possibly another static, locally configured part (depending on SDFMODE + LDVEN)
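
A small illustrative sketch of that composition (Python; the function name, the credential handling and the example values are hypothetical, not part of the proposal itself):

# Sketch: SDFMODE static part + LDPATH variable part + a locally configured part
# (credentials here). Purely illustrative; names and structure are assumptions.
def compose_surface_url(sdfmode, ldpath, credentials=None):
    base = sdfmode.rstrip("/")
    if credentials:
        # e.g. "ftp://server:port/dir" -> "ftp://user:pass@server:port/dir"
        scheme, _, rest = base.partition("://")
        base = f"{scheme}://{credentials}@{rest}"
    return f"{base}/{ldpath}"

# compose_surface_url("ftp://server:port/dir/subdir/", "file.sdf", "user:pass")
# -> "ftp://user:pass@server:port/dir/subdir/file.sdf"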

Hi Etienne, thanks for the response.

Regarding PROCBLK, I fully agree with all of your "should"s: this should be handled elsewhere and never sent to devices. But from past experience, other labels that similarly "should" be handled differently and not sent to devices do sometimes get sent to devices. So since this sort of thing does happen, it seems reasonable to assume that it will happen again. Certainly that seems like a more likely assumption than that it will indeed never happen in practice.

The DCS can try to go very strict on this, define "must" explicitly and more strictly (and change some current uses of the word, which are currently more like common usage than a standard-defined term), and then explicitly state that these blocks MUST never be made available to devices. That would be a much bigger overkill, and much more difficult to assure (since it can be a setup issue rather than an individual LMS/device, in which case the person in the lab who does it doesn't even directly use the DCS and it can't be enforced), than simply doing something like the suggested additional label for such embedded data.

To be clear, such a label isn't the only way to do that (and certainly instead of a "text copy of another label" it could be a "variable-field-count label where the first field is a label and the rest are fields for that label", if that seems more in the spirit of the DCS), just what seemed to me the quickest/easiest option with the least amount of extra data/work/changes. The only important criteria are that it be very clear to aware devices, and ignored by default by non-aware devices. As far as the LMS side is concerned, the majority of the effort would be handling context-awareness, where labels that previously didn't care about order now sit in a place where the order (after/before PROCBLK/PROCEND) matters. So whatever is inside the block will need to be treated/read/parsed differently anyway; I'd be surprised if stripping an internal label from it were a significant extra effort.

 

On SDFMODE, there are two related points, which are why I think keeping it all in LDPATH is still the correct thing to do:

1. This is a general thing, but relevant to the specific treatment here: there is no intrinsic meaning to the term "file" other than as a distinct, defined data source. There is no coherent general definition of "file" that isn't dependent on the individual implementation and file system/index.

D:\folder\123.SDF could be a chunk of a few sectors on a local hard drive, in locations indexed by the file system on that drive, and reading it will read the data from these sectors and return it. Or it could be on a mapped network drive, and reading it will send a network request that will return a bunch of data. Or it could be mapped to an ftp server, in turn asking that server to send a file. And that server may not have "real" files; it can be an ftp protocol front-end to a different type of database, or to a program that generates the data on the fly. Or one of a great many other things.

All of these would be entirely transparent, and meaningless, to whatever program tries to read that "file", because all it wants is to get the data; where and how it is stored doesn't matter to it in any way.

Similarly two URIs like

http://server:port/files/surface/vendor1/hi/123.sdf
http://server:port/calculate-surface-now.php?job=123&vendor=vendor1&res=hi

are exactly identical for a program calling one of them to get the data. They'll receive the exact same thing back. In fact the server may be configured to accept one of them and automatically convert it behind the scenes to the other and return the response from the other.

Which means, in our context, a "FILE" type just means the ability to get some sort of reference (file name/path, web URL, ftp URL, or any other URI type) that can be read, and which when read should return surface data. This differentiates it from the current alternative option of "LMS", which indicates that the surface data is being sent inline with the rest of the job data (which, again, I'm not sure is actually used anywhere, but it doesn't really matter for the sake of this discussion). Anything that says "this is where you have to read from in order to get the surface data" is a surface file location, regardless of whether a surface file even exists anywhere.

2. There is, at least as far as I can see, no practical usage scenario where splitting the location of the "file"/data source matters.

Because if the final location isn't just a direct combination of the two parts, then the device receiving these will need to know the final structure and be able to modify it. And if the device needs to know how to modify the given URI anyway, regardless of how many parts it's given in, then there is no benefit in splitting it.

That is, in other words, if the LMS knows how everything is combined, then it can combine it. If the LMS doesn't know how everything is combined, then the device needs to know how to parse/modify the URI anyway, so giving it in two parts doesn't help.

Also, as an aside, LDPATH, having "path" in its name, seems better suited by definition to hold a URI (of which a path is one type) than SDFMODE, whose name suggests it should hold a mode rather than a full location structure. And, re point 1, "there is a location where the surface data is" fits a "FILE" mode, so both labels work correctly as they are.

-

 

To illustrate point 2 above, here is a sample of what I think you're aiming for, and why splitting it over two labels doesn't help:

To be clear, this is absolutely the "advanced"/"complex" case, which is where I think you see the need; simpler cases are of course simpler and aren't different from what is already done now with LDPATH.

Say the end result is something dynamic, which I assume is the case you're working to support, that requires additional modification, like the aforementioned

http://server:port/calculate-surface-now.php?job=123&vendor=vendor1&res=hi

If I understand you correctly, what you want to do is have the LMS send something like:

...
LDPATH=123
LDVEN=vendor1
...
SDFMODE=http://server:port/calculate-surface-now.php?res=hi
...

And then have someone configure each and every device to know that if SDFMODE matches that specific pattern (because for other vendors the patterns may be entirely different, with an entirely different SDFMODE format), then it should issue a GET, adding a parameter called "vendor" which will have the value of LDVEN, and a parameter called "job" which will have the value of LDPATH.

And that... would work. But:

  1. It's special configuration on all devices for each type of SDFMODE, and if you have multiple devices you need to make sure to configure all of them the same way.
  2. All those "magic" values (like LDVEN being sent as the parameter "vendor") are completely invisible to a person trying to diagnose a problem by going over communication logs; you have to be aware of the specific format to make sense of it. Which is fine for some custom data, but seems ill-advised for something as basic and common as surface data.
  3. It isn't different from sending "LDPATH=http://server:port/calculate-surface-now.php?res=hi&job=123" and leaving it to the device configuration to know that LDVEN should be sent as "vendor" for this configuration. Or, since we're doing custom configuration of labels on the device anyway, sending the same SDFMODE content in LDPATH as "LDPATH=http://server:port/calculate-surface-now.php?res=hi", and having the device know to use JOB, or the current job number, or some other label for the "job" parameter. It's the exact same data, allowing and requiring the exact same type of work on the device, only without any artificial splitting.

    Because, again, if a dynamic structure that differs between vendors/whatever (i.e. needing different SDFMODE values) isn't needed (a single SDFMODE is enough), then there is no point in splitting. And if a dynamic structure is needed (you will need different SDFMODE values for different cases), then it's going to be dynamically parsed on the device anyway, so the split is, again, not needed: the "dynamic" part is going to be inserted into the "static" part, meaning that the "static" part isn't static.

-

If a vendor came to me with that specific scheme, where individual labels need to be placed in different locations with different names, I'd tell them that if they need it to be dynamic based on additional labels then they should make it explicitly dynamic, at which point there is no need to configure all devices, and the LMS or lens vendor can make changes whenever they want. So it's enough to send any of (just as a list of samples, varying in structure, and varying on whether the identifier that you would place in LDPATH as the dynamic part is another label or not):

LDPATH=http://server:port/calculate-surface-now.php?res=hi&job={JOB}&vendor={LDVEN}
LDPATH=http://server:port/surface/{VENDOR}/hi/{JOB}.{LDTYPE}
LDPATH=ftp://server:port/surface/{VENDOR}/hi/123.sdf
LDPATH=C:\Data\{LDVEN}\{JOB}R.xyz;C:\Data\{LDVEN}\{JOB}L.xyz

Whoever puts the scheme into the LMS already knows where "their" surface files need to be accessed, and so how a single URI for a single surface source is to be created. So there is no need to add opaque configuration on the device. Anything sent as a pair of "static"+"dynamic" data labels can be sent as a single data label.
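
To make that concrete, a minimal sketch (Python; the "{LABEL}" placeholder syntax and the example records are my own assumptions for illustration, not anything defined in the standard) of how the receiving side could expand such a templated LDPATH from the job's own records:

import re

# Sketch: expand "{LABEL}" placeholders in a templated LDPATH using the job's records.
def expand_ldpath(template, records):
    def lookup(match):
        label = match.group(1)
        return records.get(label, match.group(0))  # leave unknown labels untouched
    return re.sub(r"\{([A-Z]+)\}", lookup, template)

records = {"JOB": "123", "LDVEN": "vendor1", "LDTYPE": "sdf"}
print(expand_ldpath(
    "http://server:port/calculate-surface-now.php?res=hi&job={JOB}&vendor={LDVEN}",
    records))
# -> http://server:port/calculate-surface-now.php?res=hi&job=123&vendor=vendor1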

(I mean, really, this part of the discussion is all an overly complex way of saying that you can easily send "A+B" together instead of sending "A" and also "B" separately, especially where there is already a label dedicated to sending both A and B in, which it has been doing successfully for a while, and so can work even for sending "A,B" or "A-B" or "BA" )


Just as a quick addition, since you explicitly mentioned the example of username/password, which I didn't explicitly respond to:

This is the same as any other type of data, except that this one *does* have to be configured on the machine. But the usage is the same: there is no benefit to splitting the URI between LDPATH and SDFMODE. Imagine, again, an ftp (or http, same format) URI.

Option 1, splitting:

SDFMODE=ftp://server:port/folder
LDPATH=123.sdf

which the device needs to convert to ftp://username:password@server:port/folder/123.sdf .

This is not different from getting

LDPATH=ftp://server:port/folder/123.sdf

and turning it into the exact same ftp://username:password@server:port/folder/123.sdf .
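
As a small sketch of that second case (Python; purely illustrative, the function name and the use of urllib are my own choices), showing that adding locally configured credentials to a complete LDPATH URI is the same amount of work as adding them to a split SDFMODE value:

from urllib.parse import urlsplit, urlunsplit

# Sketch: a device inserting its own configured credentials into a full LDPATH URI.
# Works the same whether the scheme is ftp:// or http://.
def add_credentials(uri, username, password):
    parts = urlsplit(uri)                       # e.g. "ftp://server:port/folder/123.sdf"
    netloc = f"{username}:{password}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

# add_credentials("ftp://server:port/folder/123.sdf", "username", "password")
# -> "ftp://username:password@server:port/folder/123.sdf"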

Not only that, but notice that in this case, which is your own example of the need to split off a "static" part into SDFMODE, the data in SDFMODE is in fact not static, because the device took the "static" data "ftp://server:port/folder" and changed it to "ftp://username:password@server:port/folder". Calling that one "static" is not strictly or necessarily correct.


Greetings Yaron, and again, thank you for your time!

Interesting comments.

OK on PROCBLK, I'll keep your suggestion in mind; let's discuss this during the DCS meeting next week.

Regarding the SDFMODE, if I understand correctly what you are saying, you'd rather see this record completely removed and replaced with a full URI in LDPATH. The suggestion here was trying to stick to the original design (SDFMODE telling which protocol to use, LDPATH hosting the surface identifier). But this seems to be a valid option, and I propose that we discuss it next week, too.


I won't be in the DCS meeting, which is why I'm trying to be very clear and detailed about the comments, so all the points will be available for the people involved in the discussion. (plus, well, I think it's a very good idea in general to consider things in advance, without the time limit of a live meeting, and so it's unfortunate that it seems nobody but you raised any planned discussion point here in the forums before the meeting)

Regarding PROCBLK, yes, I think it's well covered at this point. Thank you.

 

On SDFMODE... almost.

  1. Yes, I think full URI should be in LDPATH.
  2. No, I don't think SDFMODE should be completely removed.
    Unless, that is, the decision in the meeting is that the LMS value is unused and should be deprecated.
    In that case, yes, I do think SDFMODE can be completely removed as well.
    Its functionality... is not stating the protocol used to access LDPATH, but rather the mode of operation for the surface data referred to by LDPATH (FILE: the Host copies LDPATH as-is, the device reads surface data from LDPATH. LMS: the Host reads surface data from LDPATH and sends it inline, the device reads the surface data inline and disregards LDPATH).
  3. No, I disagree that this position is a change of the original design/intent for SDFMODE and LDPATH, and I disagree that SDFMODE was originally intended as a way to indicate the "protocol" to use. I claim that this change (having the full URI for surface data in LDPATH, regardless of where the data is coming from) is very much in accordance with the original intent and definition.

 

 

-

 

That third point (not an actionable item itself, but probably a valid consideration when deciding about the actionable items) may need some explaining. These are what I see as strong indications that putting the full URI in LDPATH, while keeping the same FILE value in SDFMODE (or removing it altogether), is as per the original design, and that specifying the protocol in SDFMODE isn't:

  • SDFMODE currently has 4 possible types of values, two of which (LMS, FILE) are very clearly defined. The other two have a lot of vague/open parts in their definitions (which I assume you agree with, as it seems to be part of the reason you wanted to improve things). So for figuring out the "original intent", the better-defined parts should weigh more.
  • The two better-defined values don't actually specify a "protocol". So SDFMODE can't have been intended to specify the protocol of LDPATH.

    FILE isn't a protocol. Notice that right now it can be (and often is) used for local Windows file system paths (NTFS, FAT, ...), for local Unix/Linux file system paths, or for network shares using the samba/SMB protocol (which is pretty much assumed as built-in for Windows, but certainly isn't/wasn't for earlier Unix/Linux systems and would have required special handling from such computers, or other devices). So there are already several different protocols that can be used to access surface data from LDPATH, none of which is explicitly named.
    So current uses of SDFMODE+LDPATH are, in practically 100% of cases, taking the protocol information from LDPATH.

    LMS isn't a protocol for how to parse LDPATH either. For a device receiving it, it means LDPATH isn't used. For an LMS using it... it means that LDPATH still contains a path to the surface data source, but without any data on how or what it is.
  • While I didn't see the original version of the DCS where SDFMODE was added, and I wasn't involved in the prior discussions, I feel very confident (but please correct me if I'm wrong) in assuming that LMS and FILE were the main (or even, at first, the only) values that the VC committee wanted to include, and that the ftp/http mentions were added later, or as a rough afterthought, "just in case we'll need something like that in the future".
  • When sending data LDS->LMS it's possible (as per the current definition of these labels) that SDFMODE will be LMS and LDPATH will still contain a URI to the file path (in fact it's practically required). The Host is required to get the surface data from LDPATH, even though SDFMODE doesn't specify any "protocol". So the intent *can't* coherently be for the protocol to be in SDFMODE.

    The Host reading surface data from LDPATH (when SDFMODE=LMS) needs *exactly* the same info that a device needs to read surface data from LDPATH (when SDFMODE is something else); they're doing the exact same thing at that point. But for the Host... SDFMODE is already used.

    e.g. if on LDS->LMS communication SDFMODE=LMS, then it's possible that LDPATH=\\server\share\file.sdf, or LDPATH=/mnt/somewhere/file.sdf, or LDPATH=c:\folder\file.sdf, without the FILE value appearing anywhere. But what if a new/future LDS *still* wants the Host to forward the data as with SDFMODE=LMS, but shares its files with the LMS/Host through an ftp server? It *can't* send both SDFMODE=LMS *and* SDFMODE=ftp://... at the same time. So it must send SDFMODE=LMS, and then LDPATH=ftp://... .

    Unless you're suggesting either to A) add another label, so there will be one for the LMS/Host forwarding mode (how LDPATH should be used, per the current definition, with FILE/LMS) and another for the protocol used by LDPATH, or B) remove the LMS value from SDFMODE while expanding it for protocols.

    Neither of these matches the claim that SDFMODE was originally intended to carry the protocol of LDPATH, since one of them is absolutely *required* in order to send the "protocol" outside of LDPATH while still keeping the other original value (LMS) usable.
  • As mentioned before, LDPATH is already defined as being able to contain a fully qualified path, which is a term that covers several types of complete/full URI. So obviously LDPATH can contain a full URI, protocol signifiers included, as currently defined.

 

So, I don't see any conflict with the original intent in just removing the current ftp/http options from SDFMODE (or dropping SDFMODE entirely if LMS isn't needed) and only using LDPATH for the surface source reference. The changes needed are pretty small, consisting of:

  • Changing LDPATH definition to state "URI" instead of "path" (with matching syntax/phrasing changes)
  • Adding a comment that actual supported protocols/formats will be dependent on agreements-with/support-by devices.
    This is the status today *already*, considering samba network shares aren't 100% supported out of the box on everything, and different operating systems on devices will need file paths in different formats anyway.
    So that's not really a change, but rather an already implicit assumption that isn't written in the DCS even though it should be.
    And it's about the same as what would need to be added anyway, just to SDFMODE, under your original suggestion.

That's simplifying things, and (as I hope I managed to express) keeping with the original intent. It's also a change that "automatically" makes things compatible with future protocols, since it removes the current explicit mention of just a few protocols, without explicitly mentioning any new ones.


Yaron:  I wanted to post this follow-up based on our discussions from the DCS meeting today so you'd have a chance to offer your thoughts.  Your insight on this proposal was extremely valuable and helped the group avoid taking a wrong path, so thank you very much! 

To members of the group: please feel free to correct me where I've gotten any details of the consensus wrong.

The group agreed to essentially adopt a modified version of your prefix label proposal:

PROCBLK=Prefix;DEV;[VEN];[MOD];[SN]

Where "Prefix" is an integer value.  Instead of using "=" as a delimiter for the prefix the group settled on "!" (not already in use, not reserved, unlikely to be strongly desired to be part of a real record label name).  By using the prefix the PROCEND record is no longer needed (all [prefix]! records belong to that PROCBLK).  Examples of this PROCBLK format that the group came up with during the meeting would look like this:

PROCBLK=1;FSG;;;
1!LMATID=70;70
1!_SCSBRD=2.50;2.50
1!_SCSBCCA=0;180
1!GAX=0;0
PROCBLK=2;FSG;SCH;HSC;
2!LMATID=72;72
2!_SCSBRD=2.50;2.50
2!_SCSBCCA=0;180
2!GAX=0;0

The standard will require PROCBLK to precede any prefixed records and that the prefixed records be contiguous.  This would be similar to the other dataset definitions.  Do you see any issues with this solution?
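
For what it's worth, a minimal sketch (Python; names and data structures are illustrative only, nothing here is normative) of how a receiving system might group these prefixed records under their PROCBLK headers, given the agreed constraint that PROCBLK precedes its contiguous prefixed records:

# Sketch: group "prefix!LABEL=value" records under the preceding PROCBLK header.
def parse_procblk(lines):
    general, blocks = {}, {}   # blocks: prefix -> {"target": "DEV;VEN;MOD;SN", "records": [...]}
    for line in lines:
        label, _, value = line.partition("=")
        if label == "PROCBLK":
            prefix, _, target = value.partition(";")
            blocks[prefix] = {"target": target, "records": []}
        elif "!" in label:
            prefix, _, inner_label = label.partition("!")
            blocks[prefix]["records"].append((inner_label, value))  # header must come first
        else:
            general.setdefault(label, []).append(value)
    return general, blocks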


Paul, thank you for that, it's good to have followup.

(And I think it can be a good idea in general to maybe post all topics post-meeting, to give even the people who were involved time to think things through more than is possible during a live meeting.)

Generally this seems good, just a few thoughts:

  • The big, and only, benefit I see in having a variable/dynamic prefix per block is the ability to drop the ordering requirements, since then the records don't really have to be contiguous, and it's still obvious which in-block line belongs to which block. But if they have to be contiguous and follow the PROCBLK record, like other datasets, then the different numbers add complexity with no benefit. In that case a fixed prefix (e.g. "B!", or always "1!", or whatever) would do just as well, since there can only be one active block at a time anyway (for the same reason that, for existing datasets, none of the "following" labels bothers to identify the record it relates to: they're already strictly related by order).
    Adding complexity for a potentially useful feature (and reducing the number of exceptions to the no-required-order rule) is very valid, but adding it and then explicitly not using it and its benefit... isn't so much.
  • Maybe it would be better to avoid adding a new element (and separator) to the "record" type, and instead just add a label for block content that accepts as its first two fields the block identifier (currently the prefix) and the original label name (e.g. instead of "1!LMATID=70;70" use "B=1;LMATID;70;70")?
    That avoids the need for a special record element and separator type entirely (no need to touch the definition of a record), in exchange for just one other label. Same functionality, and about the same amount of effort to parse.
    Isn't the "there's a new element type in a record" change a bigger one than adding one more label?

You are as astute as always... I almost posted a pre-explanation of why we didn't just do away with the prefix altogether.  There was a lengthy discussion about it and your summary of why we kept it is spot on.  The convenience factor outweighed terseness.  At least I feel that is a fair assessment of the result.

For the alternate format you proposed, it seems less clean to me than what we ended up with at the meeting but I'll let others comment. The version above will be put into the 3.12 draft for comment as well. 

Also, related to 3.12, the SDFMODE/LDPATH proposal has been postponed until a future draft so further work can be done.  A revised proposal will be forthcoming.


That's a good point, Yaron.  

I'm a little torn on this, as it's entirely true that we have a number of other datasets that are necessarily ordered and contiguous.  

TRCFMT
RR=
RR=

There's value in following that schema.  Though, I think there'd be something cleaner about using a special character to indicate belonging to a particular PROCBLK, and the '!' seems handy for this, as I'm not aware of it being in use.

PROCBLK=FSG;SCH;HSC
!LMATID=70;70
!ETC=1;2

Where there wouldn't have to be any restructuring of the contents of the block, as the line is otherwise correct.  Looking at the standard, the pipe character is a reserved character and doesn't currently have a definition when used in a name, so that might be a better alternative if we want to maintain absolute compatibility.

One other thing that might combine the best of both worlds: we've been using the PROCBLK tag to identify an upcoming block... what if we didn't use PROCBLK, but simply had the machine prefix on the tag?

FSG;SCH;HSC|LMATID=70;70
FSG;SCH;HSC|ETC=1;2

Pros
1) These device-specific records don't have to be contiguous.
2) Ease of knowing which device a line is targeting.
3) Backwards compatible.

Cons
1)  More verbose.

 


But that last idea is not very different from a PROC record, is it?  I suppose the presence of a semi-colon in a Record Label would suffice to identify the record as machine-specific.  Given that PROCBLK was a form of shorthand for PROC records (and a reasonable way to handle PROC datasets), I don't think this would be a great approach.


One difference is that it will allow for datasets.  

A common thread I've heard (outside the forum) is that it's a bit verbose for the task.  I'd argue that computers will be able to quickly generate or consume the extra text, but that when a human has to examine the OMA data to determine, for example, why a particular device is receiving a given LMATID, it'd be much faster to search through that text for "LMATID", run through the results, and immediately determine which device-specific line applied.  That's as opposed to finding an LMATID line with a particular prefix, moving back up to find out to which PROCBLK that line belongs, and getting the device from there.

Now it may be that's not a use-case that's of great concern to folks, and to help determine whether I'm imagining a problem, I've attached three example LMS files that use the three main recommendations for PROCBLK (or the lack of it).

So, what's the LMATID for a FSG-OPT-VHZ in each of these files?

ProcBlockUniquePrefix.txt

DeviceSpecificPrefix.txt

ProcBlockRequireContiguous.txt


What would be the difference between this and just refactoring PROC to support data sets?  I mean, other than requiring more characters ("PROC=" for every line) it would seem to be the same thing, and to me it is cleaner since it's in keeping with the existing record formats in the standard.  We would simply need to indicate that PROC records for data sets must be contiguous.  I'm not a fan of that solution, as I felt the prefix idea from the meeting was clean and very simple, but if we're going to go with this sort of approach I would think just tweaking PROC would be a more homogeneous solution.


I can think of 2 benefits to having the prefix:

- It makes it easier to scroll down a DCS packet and visually differentiate the blocks (as opposed to being very cautious and trying not to miss a PROCBLK record when several blocks follow one another).

- It might ease LDS-to-LMS support to have a block identifier (as opposed to finding the block by looking at the specific device / vendor / model / SN).

But honestly, as long as we can agree on a solution that allows us to have device-specific dataset records, I'll be 100% happy. :)


The "have all the details in every line" option is indeed not different from PROC. And since the whole idea behind PROCBLK was to avoid listing all the details over and over for each label, and since the discussion wouldn't have gotten this far if there weren't general agreement among members that that's a good idea... this one probably shouldn't be it; it's just a more complex way of saying "no, just using PROC is fine". Which, again, it's considered not to be.

 

As for whether or not to have a unique prefix/identifier, again, I see the main benefit as the ability to not keep things in order. So if that seems like a worthwhile benefit, then having a prefix/identifier and no order requirement is the way to go; otherwise not having a per-block identifier is probably better.

Why I don't think having a per-block identifier, while still keeping order, is a good idea:

  • Technically, for machines/devices/software processing the data it's entirely useless. It is an additional field, additional data, and additional processing complexity that serve no technical purpose, since if the labels are ordered then the association is already completely known without being listed.
  • Regarding:
    2 hours ago, Etienne JULLIEN said:

    It makes it easier to scroll down a DCS packet and visually differentiate the blocks (as opposed to being very cautious and trying not to miss a PROCBLK record when several blocks follow one another).


    That may be true, but... well, why would the support/technical people read the LMS files (or communication logs) using Notepad instead of one of the very many better text editors that can provide syntax highlighting? There are even lots of fantastic free ones, and they make life much easier for a lot of reasons.
    I mean, using a copy from the example file provided by Steve, do you really have any difficulty locating PROCBLK here:
    [attached screenshot: vca-text-procblk-1.jpg, the example file shown with syntax highlighting]
  • Or, alternatively, on the same topic: if the purpose of the identifier is only to be human-visible anyway, why not instead specify that any PROCBLK label must be preceded by a few REM ones? That would create a very distinct visible break all by itself, while adding no complexity/logic for any device having to parse the data:
    [attached screenshot: vca-text-procblk-2.jpg, the same example with REM records before each PROCBLK]
    Easier for everyone, and a lot less work to implement.
  • Regarding
    10 hours ago, Steve Shanbaum said:

    as it's entirely true that we have a number of other datasets that are necessarily ordered and contiguous

    These all have to be ordered, though, not just in terms of what they relate to (e.g. this R is related to that TRCFMT), but also in individual sequence (this R is immediately after that R, which is immediately after that R...).
    But these PROCBLK sections only have the first aspect (which section/dataset they relate to), without any care about internal order beyond that. Well, unless there's another dataset inside a block, but that one would need to be ordered anyway regardless of where it is, and is not related in order to other labels.
    Not using an identifier per block justifies having these follow the block in order, since there isn't another way to make that first type of connection. But having an identifier means that this relation is already established, and doesn't have to be maintained further. At which point it's no different from any other label in the standard that can be anywhere, and not enforcing order is the common case in the standard, not the exception.

So, again, I like the prefix/identifier idea and think it was a good suggestion, but I don't see the benefit if it isn't used to remove the order requirement.

Whether the order requirement is a good idea is a different topic. I like "no" as the answer, but I'm not sure there's a good enough reason to add the effort of identifiers for it.

What I see as the potential reason/usage-case for not forcing order is mainly that I'm not sure there is a benefit to forcing order, so why force a limitation without good reason?
There are two main not-entirely-haphazard ways to send this type of data (a main common record pool, plus per-case exceptions):

  • By case/device. As in the current ordered examples: an entire section for the general/common pool, and individual sections per case. It makes sense for the device/program creating it if it orders things logically per case, and is useful when the person reading it cares more about "what should everything be for case/device X?"
  • By record/label, as is currently done most of the time, and as is done internally per section in the previous/ordered case. All LMATID are together, all GAX are together, and so on. It makes sense for the device/program creating it if it iterates over the available labels, and is useful when the person reading it cares more about "what is going on with label X?".

I'm not at all sure that the second scenario is more common/important/useful than the first. It's just that I'm also not at all sure that the first one is that much more common/important/useful than the second. But forcing order enforces the first, while not forcing order leaves the decision to whatever creates the data blocks (nothing prevents them from still listing everything in order per PROCBLK; they just don't have to if they think it will be better otherwise).

 

The main reason I saw mentioned for forcing order relates to:

8 hours ago, Steve Shanbaum said:

why a particular device is receiving a given LMATID, it'd be much faster to search through that text for "LMATID" and run through the results and immediately determine which device-specific line applied.

While I don't think scrolling a few lines up to PROCBLK is a problem, it can be a noticeable increase in effort to then search for "PROCBLK=prefix" and go back (minor for someone used to taking advantage of text editor features like bookmarks and "go to previous location", but a bigger hassle otherwise, due to either habits or using a limited viewing tool).

On the other hand, while the above problem is entirely true when checking a single label (e.g. LMATID), it's less so when caring about multiple labels anyway, since the work (scrolling up, or searching and then going back) is split among multiple labels. And so that real but small effort would quickly become relatively negligible.

Also, as a person going over a file that has multiple PROCBLK sections, I wouldn't start by searching for LMATID; I'd start by searching for PROCBLK, because there can be more than one block that affects a device. e.g. in:
[attached screenshot: vca-text-procblk-3.jpg, an example file with multiple PROCBLK sections]
If I need to know what happened for FSG;OPT;PLR then the labels needed may be spread among general/common ones, block 4, and block 5, in that increasing order of priority. At which point sorting out, say, all of LMATID, GAX, LIND:
[attached screenshot: vca-text-procblk-4.jpg, search results for LMATID, GAX and LIND in that file]
I have a very easy time finding which ones I care about, and I already know which index values I want and which I don't. And this is from a file where they're not in order. Actually I think that having the identifiers (e.g. look for 4 or 5) makes it easier than if everything were spelled out on every line: once you know what "5" is, it's easier to keep scanning (visually) for it than for a repeated "FSG;OPT;PLR;", which is more visually condensed and information-heavy.

Having these in order would make things a bit easier here, but isn't really that much of a difference.

 

 


Unless you guys feel further need to discuss, I'm reading a consensus here.  If I'm understanding the discussion at this stage, then I will move forward with adding the proposal the group agreed upon at the meeting to 3.12, as it seems there is further consensus here that the prefix method is "good enough".  Further comments can be collected as part of the review process for 3.12 if necessary.  If someone feels that my conclusion is premature, please let me know.


Actually, 3 more comments. All of these are also relevant to the original PROC, which entered the standard without dealing with them, so they probably don't justify delaying adding PROCBLK to the standard if there isn't a quick agreement on them:

1. Record length. If any record can reach the 80-character limit, then a PROC record (or a prefixed record in a PROCBLK section) carrying the same data would overrun that limit. And these records must be able to carry the same data, so by definition they can overrun the limit. There aren't any really good options beyond dropping the limit altogether, which I personally suspect is fine (probably nobody reads the data on fixed-width terminal screens, which I assume was the original reason), but it seems somewhat out of scope for that to just happen as an implicit byproduct of adding PROC+ without further discussion.

2. What is the handling of non-singular records? That is, of course there is only one LIND, so if there is a better-fitting one in PROC+ it takes precedence. But what about a label that makes sense multiple times, for example if someone sends ENGMARK inside a PROCBLK? Should these replace ENGMARK (and similar) records in lower-fit (or general) areas? Always be added to them and never replace them? Should there be a further modification to the structure to provide a way to indicate this explicitly, since both could make sense depending on the individual case?

3. If something is defined/replaced in a lower-fitting area, and then not mentioned in an existing higher-fitting one, should it be taken from the general area or from the best existing area? I expect the intent is the latter, but without it being explicitly stated in the standard it's entirely possible for an LMS to decide to only consider the general and best-fit areas and ignore the interim ones.

To clarify, if that was too unspecific, imagine:

A=1;1
B=1;1
PROCBLK=1;DEV;
1!A=2;2
1!B=2;2
PROCBLK=2;DEV;VEN
2!A=3;3

For devices not of type DEV, both labels are 1. For devices of type DEV by a vendor that isn't VEN, both are 2. For devices of type DEV from vendor VEN, it is clear that A is 3, but it is not explicitly clear whether B is 2 or 1. Should the logic be "try to find the best fit for each label", or should it be "find the best-fit block for each device and only use that, plus the general/global labels"? Again, I'm pretty sure it's the first, but I can see how the second could seem like a reasonable interpretation, so whichever is true should be mentioned explicitly.
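
To make the first interpretation concrete, here is a small sketch (Python; the data structures and names are illustrative only) of "best fit for each label", where more specific matching blocks override the general records label by label:

# Sketch: blocks that match the device, ordered from least to most specific,
# override the general records per label. Illustrative only.
def resolve_labels(general, matching_blocks_in_specificity_order):
    resolved = dict(general)
    for block in matching_blocks_in_specificity_order:
        resolved.update(block)
    return resolved

general = {"A": "1;1", "B": "1;1"}
dev_block = {"A": "2;2", "B": "2;2"}      # PROCBLK=1;DEV;
dev_ven_block = {"A": "3;3"}              # PROCBLK=2;DEV;VEN

# A device matching DEV;VEN would get A=3;3 and B=2;2 under this reading:
print(resolve_labels(general, [dev_block, dev_ven_block]))
# {'A': '3;3', 'B': '2;2'}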


In response to your three points:

1.  We will be polling companies to determine the impact of relaxing the 80 character limit.  The consensus at the meeting was similar to yours, that it is probably OK but we will need to check before making the change.

2.  &  3.  If I understood the committee's intentions correctly, all matching DEV would get A=2;2 and B=2;2.  All matching DEV;VEN would get A=3;3 and B=2;2.  My impression and belief is that inheritance was an expected benefit of this method.  However, it may not be that everyone on the committee was thinking the same in that regard.  I suppose we shall see based on this post.

An interesting case to consider is that if for some reason you did not want your more specific DEV;VEN to get that overridden value you would have to do something like this:

A=1;1 
B=1;1 
PROCBLK=1;DEV; 
1!A=2;2 
1!B=2;2 
PROCBLK=2;DEV;VEN 
2!A=3;3
2!B=1;1

That allows as much overriding and inheritance as one would prefer I think. Does anyone see any pitfalls with this approach?  Or have I misinterpreted the inheritance discussions?


Greetings,

I concur 100% with those propositions. The best fit should be found each time (DEV+VEN+MODEL+S/N > DEV+VEN+MODEL > DEV+VEN > DEV > master LMS) for each record.

If you do not want the parent value to be inherited, you just explicitly respecify its value in the child block, as in Paul's example.

Thank you!


6 hours ago, Etienne JULLIEN said:

Greetings,

I concur 100% with those propositions. The best fit should be found each time (DEV+VEN+MODEL+S/N > DEV+VEN+MODEL > DEV+VEN > DEV > master LMS) for each record.

If you do not want the parent value to be inherited, you just explicitly respecify its value in the child block, as in Paul's example.

Thank you!

Etienne, do you and your team want to draft the initial language to put into the document?  If you guys draft the initial language then I can insert it into the document and Robert can edit it as he sees fit.  Just need a starting point and I think the proposal has changed quite a bit.  What do you think?


  • 2 weeks later...

Sorry, accidentally posted the previous one too soon. Full one follows here:

 

One comment,

Sorry for not catching this earlier; this is per my point #2 above, which, on re-reading, Paul's response didn't explicitly cover, and I think that the implicit assumption (now made explicit with the ENGMARK example in the modified proposal, which otherwise, on a quick read, seems good) is wrong.

Specifically, everything now seems fine for most labels, for which there is only a single valid value (e.g. GAX for a lens, for the same machine, cannot be both 90 and 0).

The problem is for labels which can have multiple different copies, all valid. The main example that springs to mind for me (since it's what our machines deal with) is ENGMARK, but I'm sure this isn't the only case (like maybe DRILLE records? others?).

That is, in a data packet like

DO=B
ADD=1.75;2.50
ENGMARK=MASK;MainDesign;....
ENGMARK=TXT;A;....
ENGMARK=TXT;B;....
ENGMARK=DCS;ADD;....
...

*all* of the ENGMARK records will be used (depending on other data labels like front/back, left/right and ENG/INK, not relevant in this context). An engraver would engrave all 4: the main design mask, two explicit texts, and one text with the value of ADD.

So then what happens on:

DO=B
ADD=1.75;2.50
ENGMARK=MASK;MainDesign;....
ENGMARK=TXT;A;....
ENGMARK=TXT;B;....
ENGMARK=DCS;ADD;....
...
PROCBLK=1;ENG;VEN1;...
1!ENGMARK=MASK;ExtraVen1...

With the current (and logical and correct) attitude towards labels in general, the engraver by VEN1 would *only* engrave the mask "ExtraVen1". That's... a valid interpretation. But I think in most cases for such labels it would make more sense if this were an addition rather than a replacement (that is, all other engravers would have 4 things to engrave, while VEN1's engraver would have 5).
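
A tiny sketch of the two possible merge behaviours for such multi-valued labels (Python; illustrative only, and which behaviour the standard should mandate is exactly the open question here):

# Sketch: two ways to merge multi-valued labels (e.g. ENGMARK) from a matching
# PROCBLK with the general ones. Names and structure are illustrative only.
def merge_replace(general_values, block_values):
    # block values replace the general ones entirely (the "single-value" attitude)
    return block_values if block_values else general_values

def merge_append(general_values, block_values):
    # block values are added on top of the general ones ("also engrave this")
    return general_values + block_values

general = ["MASK;MainDesign", "TXT;A", "TXT;B", "DCS;ADD"]
ven1_block = ["MASK;ExtraVen1"]

print(merge_replace(general, ven1_block))   # 1 mark  -> the "replace" reading
print(merge_append(general, ven1_block))    # 5 marks -> the "add" reading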

Both can make sense in different situations (which is why it may be worth considering, in the future, some more explicit way to decide, though that can get very complicated very fast). But from what I can think of in these situations, I'd assume that "add these" may make more sense, more often, than "replace these".

Though opinions from anyone who may theoretically send these things would certainly be better than mine in this context, as to the likeliest interpretation.

(And I'm also thinking about a natural expansion of the usage of these, in labs that maybe have another system which knows of some "tweaks" a few devices need that may not be considered by the LDS. It can make a lot of sense for such systems to also send additional job data to an LMS by taking advantage of PROC/PROCBLK, and again in those cases single-value labels absolutely make sense as replace/override, but multiple-valid-value labels could make more sense as "also add this for that device", which I know for sure is a wanted usage since we already have labs doing this in general.)

