Friday, August 18, 2017

Dining Philosophers in LabVIEW

As I program in LabVIEW, I am surprised by how often small bits of code cause large problems. For example, I was working with a large system that communicated with many instruments. I made some changes, tried running the code, and the program locked up! My gut reaction was to blame the LabVIEW development environment or a driver, but in the end I discovered that the code I had changed caused the hang.

Dining Philosophers
I did finally find what was causing the hang, but it took a long time. The hang was caused by two resources that were shared by two different pieces of code (see Wikipedia's Dining Philosophers article for a good explanation of this). Each piece of code held one of the resources, but neither would relinquish it because each was waiting for the resource the other one held.
This can be shown in LabVIEW with the following code:
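A LabVIEW block diagram is graphical, so as a rough text analogue here is a minimal Python sketch of the same pattern. It assumes each resource is a single-element queue (dequeue to take it, enqueue to give it back); the names `make_resource`, `worker`, and `run_demo` are made up for illustration. A timeout stands in for the hang so the demo can finish and report what happened.

```python
import queue
import threading


def make_resource(value):
    """Single-element queue used as a reference: get() takes it, put() returns it."""
    q = queue.Queue(maxsize=1)
    q.put(value)
    return q


def worker(name, first, second, barrier, results, timeout):
    held = first.get()                  # take the first resource
    barrier.wait()                      # both workers now hold one resource each
    try:
        other = second.get(timeout=timeout)   # wait for the one the other worker holds
    except queue.Empty:
        results[name] = "timed out"     # without the timeout this would wait forever
    else:
        results[name] = "ok"
        second.put(other)
    barrier.wait()                      # release only after both attempts have ended
    first.put(held)


def run_demo(timeout=0.2):
    a, b = make_resource("A"), make_resource("B")
    barrier = threading.Barrier(2)
    results = {}
    threads = [
        threading.Thread(target=worker, args=("loop 1", a, b, barrier, results, timeout)),
        threading.Thread(target=worker, args=("loop 2", b, a, barrier, results, timeout)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_demo())
```

Both workers report "timed out": each holds one resource and waits for the other, which is exactly the situation that hangs when there is no timeout.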

So Don't do That
The above example is simple enough to be trivially avoided. However, it can get more subtle. Consider a non-reentrant subVI: a second caller to the subVI will block until the subVI finishes executing for the first caller. This allows the same hang to occur with a subVI and a queue, as shown below. Note that a second queue is shown which does not cause the hang, but is needed for the subVI to be called twice:
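To make the subVI case concrete in text form, here is a hedged Python sketch. It is one plausible arrangement, not a transcription of the block diagram: a `threading.Lock` stands in for the non-reentrant subVI (only one caller at a time), and a single-element queue is the shared resource. The names `run_subvi_demo`, `caller_a`, and `caller_b` are hypothetical, and timeouts replace the hang so the demo terminates.

```python
import queue
import threading


def run_subvi_demo(timeout=0.2):
    resource = queue.Queue(maxsize=1)
    resource.put("instrument session")
    subvi = threading.Lock()            # non-reentrant subVI: one caller at a time
    barrier = threading.Barrier(2)
    results = {}

    def caller_a():
        held = resource.get()           # dequeues the resource first...
        barrier.wait()
        entered = subvi.acquire(timeout=timeout)   # ...then tries to call the subVI
        results["caller A"] = "ran subVI" if entered else "blocked on subVI"
        barrier.wait()                  # hold everything until both attempts end
        if entered:
            subvi.release()
        resource.put(held)

    def caller_b():
        with subvi:                     # is already executing the subVI...
            barrier.wait()
            try:
                item = resource.get(timeout=timeout)   # ...which needs the resource
            except queue.Empty:
                results["caller B"] = "blocked on queue"
            else:
                results["caller B"] = "got resource"
                resource.put(item)
            barrier.wait()

    threads = [threading.Thread(target=caller_a), threading.Thread(target=caller_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_subvi_demo())
```

Here the subVI itself is one of the two contended resources: caller A holds the queue element and waits for the subVI, while caller B occupies the subVI and waits for the queue element.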

Still Want to Use References?
The code I had written was fixed easily. It predated Data Value References (DVRs) in LabVIEW, so it used a Queue with a single element as a referenced resource. I changed the data dependency so that the Dequeue would happen later and not block the other code from executing. The experience did make me a little more cautious, and I’m on a quest to find out how to avoid these mistakes in the future. Here’s what I’ve got so far:
·         Use a message-based architecture to avoid sharing resources. I know many are successful with message-based architectures like the Actor Framework. I cringe at doing this myself. It works, but it trades one set of problems for another.
·         Use the “In Place Element Structure” instead of Queues for references.  The In Place Element Structure groups the data value read in the same structure as the data value write.  This means that the read won’t happen until all the inputs to the structure are ready.  The Dequeue function in the example above happens before all the inputs are ready for the Enqueue function, and that allows the hang to occur.
·         Consider setting a VI's execution to reentrant. When you create a new VI, the default is non-reentrant. A non-reentrant VI is a resource that must be shared, and it may block its callers.
·         Be especially wary of code that must wait for multiple resources to be available. For such code, see the Dining Philosophers article for ways to address the lockup issue.
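Two of these ideas can be sketched together in Python. This is only an analogue under stated assumptions, not how LabVIEW implements anything: the hypothetical `DataValueReference` class keeps the read and the write inside one scoped block (in the spirit of the In Place Element Structure), and the hypothetical `in_place_all` helper takes multiple references in one fixed global order, which is the resource-hierarchy solution described in the Dining Philosophers article.

```python
import itertools
import threading
from contextlib import ExitStack, contextmanager

_ids = itertools.count()                # hands out a fixed global order for references


class DataValueReference:
    """Hypothetical DVR stand-in: the value is only touched inside an in_place block."""

    def __init__(self, value):
        self.value = value
        self.order = next(_ids)
        self._lock = threading.Lock()

    @contextmanager
    def in_place(self):
        with self._lock:                # read and write share one scope
            yield self


@contextmanager
def in_place_all(*refs):
    """Take several references in the same order, whatever order the caller lists them in."""
    with ExitStack() as stack:
        for ref in sorted(refs, key=lambda r: r.order):
            stack.enter_context(ref.in_place())
        yield refs


def transfer(src, dst, amount):
    with in_place_all(src, dst):        # both references or neither
        src.value -= amount
        dst.value += amount


def run_fixed_demo(rounds=1000):
    a, b = DataValueReference(100), DataValueReference(100)

    def repeat(src, dst):
        for _ in range(rounds):
            transfer(src, dst, 1)

    # The two threads name the references in opposite order, as in the hang above.
    threads = [
        threading.Thread(target=repeat, args=(a, b), daemon=True),
        threading.Thread(target=repeat, args=(b, a), daemon=True),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)
    still_running = any(t.is_alive() for t in threads)
    return a.value, b.value, still_running


if __name__ == "__main__":
    print(run_fixed_demo())
```

Because every caller takes the lower-ordered reference first, no caller can hold one reference while waiting on a caller that holds the other in the opposite order, so the circular wait never forms.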

The Good News
References are an intuitive way to program, and they are used frequently in LabVIEW.  Consider all the references in LabVIEW:
·         Synchronization VIs: DVRs, Queues, Notifiers, etc.
·         VI Server: VI References, Control References, etc.
·         Driver APIs: DAQ Tasks, VISA Resources, FPGA References, etc.
·         Computer Resources: File References, TCP Refnums, etc.
·         Many, many others
You will use many references as you program in LabVIEW.  You may even start to use references in your APIs as I do.  If you have advice or a story about references locking up code, I’d love to hear it!  
