ehFeedback error
Hi Chris,
We've been sporadically getting the attached error (ehFeedback error, see stdout), and shortly afterward MWorks crashes.
Do you have thoughts on what might be causing it? Is there anything we can do other than restarting the computer/LabJack?
Thanks,
Lindsey.
-
Image_from_iOS.jpg 1.42 MB
31 Posted by mhisted on 12 Jan, 2022 02:10 PM
I’ll ask if anyone has seen this on my end.
32 Posted by llglick on 31 Jan, 2022 06:39 PM
Hi Chris,
Do you have any new thoughts on this error?
We've also been getting this (or a similar) error when using a different experiment. It only happens sporadically, and unlike the other experiment, I haven't figured out the conditions that cause it. This experiment only requires triggers sent from a pulse generator to run, so it will potentially be easier to replicate.
I've attached a screenshot of the error in the console window and the .xml that I was using.
Thanks,
Lindsey
Support Staff 33 Posted by Christopher Sta... on 01 Feb, 2022 03:04 PM
Hi Lindsey,
Apologies for not following up on this sooner.
This looks like a different issue from the one we fixed previously. The old issue led to errors when the plugin attempted to receive a response from the device. Based on the messages written to the server console and stdout, it looks like the new issue is due to a failure to send data to the device. Specifically, a call to LJUSB_Write is failing, which causes ehFeedback to fail, which causes ljU6ReadPorts to fail, which causes readLeverDI to call stopDeviceIO, which ultimately leads to more LJUSB_Write failures and more error messages. It seems like, after the first write failure, all subsequent write attempts also fail.
At the moment, I don't have any theories as to why writes to the device start failing. I'm going to try running the experiments you sent me, in the hope that I can reproduce the problem.
On your end, can you try running the failing experiment with LJ_DEBUG set to 1? To do this, edit line 47 in labjackusb.c to look like this:
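A minimal sketch of the sort of change presumably meant here, assuming line 47 defines the driver's LJ_DEBUG macro (the exact original line may differ):
#define LJ_DEBUG 1  /* enable verbose debug output on stderr (normally 0) */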
then rebuild and reinstall your plugin. This will lead to a lot more messages on stderr. When the write failure happens, I'm expecting you'll see a message like
or
or some other message starting with "libusb error". If you can capture those and any subsequent stderr messages and send them to me, they may provide additional insight into what's going wrong.
Thanks,
Chris
34 Posted by llglick on 01 Feb, 2022 05:14 PM
Hi Chris,
So, now I have a crash with a different error and a different experiment.
Here are the logs from the server's terminal output and the console.
Lindsey.
Support Staff 35 Posted by Christopher Sta... on 01 Feb, 2022 09:39 PM
Hi Lindsey,
Thanks for sending the logs. It looks like the error is once again occurring in readLeverDI. However, there are two differences:
The stderr output contains exactly two error messages, both in response to the same LJUSB_Read failure:
So the read is timing out (after the default timeout interval of 1 second). ehFeedback neither detects nor handles the timeout and just treats it like a read failure ("response buffer is too small"). Given that subsequent attempts to communicate with the hardware succeed, it seems like retrying the read would probably succeed.
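As a rough illustration of that retry idea, and only as a sketch: readWithRetry below is a hypothetical helper, and it assumes the exodriver convention that LJUSB_Read returns the number of bytes read and returns 0 on failure or timeout.
#include "labjackusb.h"  /* exodriver header: HANDLE, BYTE, LJUSB_Read */
/* Hypothetical helper: retry a read that may simply have timed out,
   instead of treating the first zero-byte result as fatal. */
static unsigned long readWithRetry(HANDLE hDevice, BYTE *buffer,
                                   unsigned long count, int maxAttempts)
{
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
        unsigned long bytesRead = LJUSB_Read(hDevice, buffer, count);
        if (bytesRead > 0) {
            return bytesRead;  /* got a response; hand it back to the caller */
        }
        /* 0 bytes: possibly a timeout; give the device another chance */
    }
    return 0;  /* every attempt failed */
}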
None of this explains why the read is timing out in the first place. One second is likely much longer than any normal sort of USB I/O delay, so it seems like something must be going wrong. The old issue was due to multiple, overlapping attempts by the plugin to communicate with the device. I know I went through the plugin code looking for other spots where that might happen, but maybe I missed something. I'll take another look.
FYI, I ran your VisStimRet experiment for several hours, using a simulated counter input, and I never saw any errors.
Cheers,
Chris
36 Posted by llglick on 01 Feb, 2022 09:52 PM
Hi Chris,
Thanks for the update.
I am not surprised you haven't been able to replicate the error on VisStimRet yet; it has only happened twice so far. We have now created a desktop command to open the Server through Terminal, so I'll be able to send you full readouts of future errors.
Do you have any other suggestions moving forward?
Thanks,
Lindsey.
37 Posted by llglick on 02 Feb, 2022 07:57 PM
Hi Chris,
I had another crash today. I've attached the error logs and the MWEL code for this experiment. It runs like VisStimRet (with pulses needed to drive transitions between task states).
Lindsey.
Support Staff 38 Posted by Christopher Sta... on 02 Feb, 2022 09:49 PM
Hi Lindsey,
Thanks for sending the additional logs. This looks like another occurrence of the new issue (LJUSB_Write failure). Instead of a timeout, the first write failure produces this error message:
and the subsequent write failures produce:
Looking at the libusb source code, it seems like LIBUSB_ERROR_IO could indicate a lot of different failure modes, but LIBUSB_ERROR_PIPE mostly seems to indicate that the "endpoint halted", which probably means the connection is dead, more or less. This supports the idea that communication with the device has failed for some reason, but it doesn't tell us whether that's due to a hardware failure or a software issue.
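For reference, a halted ("stalled") endpoint is one condition libusb has a standard recovery call for, libusb_clear_halt. A hedged sketch of what handling it could look like; the endpoint, handle, and writeWithHaltRecovery helper are placeholders, not the plugin's actual code:
#include <libusb-1.0/libusb.h>
#include <stdio.h>
/* Placeholder helper: try one write, and if the endpoint reports a stall
   (LIBUSB_ERROR_PIPE), clear the halt and retry once. */
static int writeWithHaltRecovery(libusb_device_handle *handle,
                                 unsigned char endpoint,
                                 unsigned char *data, int length)
{
    int transferred = 0;
    int rc = libusb_bulk_transfer(handle, endpoint, data, length,
                                  &transferred, 1000 /* ms timeout */);
    if (rc == LIBUSB_ERROR_PIPE && libusb_clear_halt(handle, endpoint) == 0) {
        rc = libusb_bulk_transfer(handle, endpoint, data, length,
                                  &transferred, 1000);
    }
    if (rc != 0) {
        fprintf(stderr, "USB write failed: %s\n", libusb_error_name(rc));
    }
    return rc;
}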
I went through your plugin code again today. I found one spot where multiple, overlapping attempts to communicate with the device could occur. However, it was in the initialize() method, which gets called exactly once when the experiment is loaded. It seems unlikely that an error there could lead to a failure much later in your experiment, but I guess it's possible. I'll send you a pull request with the fix, but I don't expect it will fix this issue.
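As a generic illustration only (not the contents of the pull request): one common way to rule out overlapping access is to funnel every exodriver call through a single lock, for example via a hypothetical wrapper like lockedWrite below.
#include <pthread.h>
#include "labjackusb.h"  /* exodriver header: HANDLE, BYTE, LJUSB_Write */
/* One lock shared by every thread that talks to the U6. */
static pthread_mutex_t deviceLock = PTHREAD_MUTEX_INITIALIZER;
/* Hypothetical wrapper: only one thread at a time may write to the device. */
static unsigned long lockedWrite(HANDLE hDevice, BYTE *buffer, unsigned long count)
{
    pthread_mutex_lock(&deviceLock);
    unsigned long written = LJUSB_Write(hDevice, buffer, count);
    pthread_mutex_unlock(&deviceLock);
    return written;
}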
As for what to do next: At this point, I think we have all the info we're going to get from setting LJ_DEBUG to 1, so you can set it back to 0. If it's a software issue, and I can reproduce it while running in the debugger, then I might be able to figure out what's happening. So far, though, I can't reproduce it at all. Maybe it's worth sending the terminal error messages to the LabJack folks to see if they have any insight?
Chris
39 Posted by llglick on 02 Feb, 2022 10:19 PM
Hi Chris,
I have reverted LJ_DEBUG to 0, and I just pinged LabJack and will see what they say.
The initial read error was very closely associated with updating to the most recent version of MWorks. The write errors are more recent, but could also be associated with the update, given that they are much rarer. Is there anything that changed in MWorks that could explain this?
Lindsey.
Support Staff 40 Posted by Christopher Sta... on 02 Feb, 2022 10:20 PM
Actually, before you do that, there's one more thing you could try. If you launch MWServer from Terminal with the following command:
then, in addition to the LabJack debugging messages, libusb errors and warnings should be written to stderr. Maybe you can try running that way until you get another failure, then send me the output (which hopefully will have even more specific info about what failed)?
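One plausible form for that command, assuming a standard install location and libusb's usual LIBUSB_DEBUG logging environment variable, would be something like LIBUSB_DEBUG=4 /Applications/MWServer.app/Contents/MacOS/MWServer, which launches the server binary directly with libusb's log level raised.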
Thanks,
Chris
41 Posted by llglick on 02 Feb, 2022 10:23 PM
OK, I can try that.
However, is it possible that the extra overhead of the debugger is triggering some of these crashes?
Lindsey.
Support Staff 42 Posted by Christopher Sta... on 02 Feb, 2022 10:32 PM
Hmm. MWorks 0.10 added the DATAPixx device, which also uses libusb. I see that libusb creates some "worker" threads when MWServer launches, even before an experiment is loaded. I guess it's possible that having two plugins both embedding libusb could lead to conflicts.
If you want to test this, remove the DATAPixx plugin. It's at
Then restart MWServer.
That's the only connection I can think of between the inner workings of your LabJack plugin and anything else in MWorks.
You aren't running a debugger. You're just producing a lot of output on stderr. I can't imagine how that could cause USB I/O failures.
Chris
43 Posted by llglick on 02 Feb, 2022 10:36 PM
OK, I'll try deleting the DATAPixx.bundle.
Thanks,
Lindsey.
Support Staff 44 Posted by Christopher Sta... on 03 Feb, 2022 02:13 PM
Hi Lindsey,
I submitted a pull request with the changes I mentioned. I haven't seen any failures while running twoStim_2P_Frames.
Chris
45 Posted by llglick on 04 Feb, 2022 10:02 PM
Hi Chris,
Your changes in the pull request seem to have solved the original ehFeedback error (the read failure).
We'll see if it also solves the write failure issues (deleting the DATAPixx bundle did not).
Thanks,
Lindsey.
46 Posted by llglick on 06 Feb, 2022 06:23 PM
Hi Chris,
Unfortunately, it did not solve the write error with VisStimRet.
I've attached the error logs.
Lindsey.
Support Staff 47 Posted by Christopher Sta... on 08 Feb, 2022 10:20 PM
Hi Lindsey,
I'm surprised but pleased to hear that the pull request solved the original read failure!
OK. If nothing else, it's good to know that using libusb in multiple plugins isn't an issue.
Thanks for sending those, as they do include a little new information. Here's the relevant portion of the stderr log:
It looks like the initial write failure produces a message from libusb saying "device not responding", whereas subsequent attempts generate "pipe is stalled". This aligns with my previous description of this failure mode: Communication with the device fails for some reason and doesn't recover.
"Device not responding" certainly makes it sound like USB I/O with the device has broken. I don't see any reason to think the plugin has done something wrong: Everything appears to be running smoothly, then the connection up and dies. On the other hand, I can think of many possibilities elsewhere, such as a fault in the LabJack itself, the USB cable or hub, a bug in libusb, or a problem in the macOS USB driver stack.
Unfortunately, I don't see any evidence to suggest that any of them is the cause, and I still can't reproduce the issue myself.
Are you seeing this issue with more than one LabJack? If so, that makes a hardware fault seem much less likely (though not impossible). Are you seeing it on multiple Macs? If it only happens on one machine, then it's possible there's a fault in its hardware (or in the macOS version it's running).
Chris
48 Posted by llglick on 08 Feb, 2022 10:57 PM
Hi Chris,
We are just seeing this issue on one computer, because we only have one version of this microscope.
LabJack also suggested it might be a hardware issue. I tried swapping out the USB cable, and that didn't help. I just tried moving to a new port on the USB hub, but will now also try swapping out the hub. Any suggestions for how to test for bugs in libusb or the driver?
One new observation: today it crashed repeatedly after very few trials (this is unusual; it usually happens more rarely). After one crash we tried just stopping and starting MWorks without closing out the session, and it just proceeded as if nothing had happened; maybe that re-initiates the connection? Also, we then did restart MWorks, and it didn't crash again (today) after that.
Lindsey.
Support Staff 49 Posted by Christopher Sta... on 10 Feb, 2022 05:46 PM
Hi Lindsey,
Probably the best thing would be to use the U6 with a Windows machine, since that would involve completely different USB driver code and different OS-specific code in libusb. But obviously you can't run your MWorks experiment under Windows, so that's not really an option.
Apart from that, someone with deep knowledge of libusb, the macOS USB driver stack, and/or the low-level USB I/O performed by the U6 might be able to assess whether things are working correctly on the software side. But I don't think I'd be able to figure it out.
Interesting. Starting the device (i.e. calling startDeviceIO) will close and re-open the USB connection, but only if the first attempt to configure the device fails. If this happens, the MWServer console should show a warning saying:
LJU6 found dead, restarting. (bug if not on MWServer restart)
Is that what you see?
Chris
50 Posted by mhisted on 10 Feb, 2022 05:57 PM
> LJU6 found dead, restarting. (bug if not on MWServer restart)
Back in the day, when I wrote the code that emits this, I did the restart because the LabJack could occasionally become unreachable sometime between an MWorks stop and the next MWorks restart.
At some point, this stopped being a problem so we rarely see this message now.
At the time, if I remember correctly, I patched libusb to do some kind of deep USB restart, which was possible with stock libusb on Linux but not on the Mac. The Mac version of the libusb restart/reset call didn't do much. Again, this was a long time ago, so I don't have a clear memory. Depending on what's going wrong, it might be possible to do "stronger" resets of the Mac USB stack from MWorks.
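For context, the libusb reset call referred to here is presumably libusb_reset_device; a hedged sketch of how a reset attempt might be structured follows, where the tryDeviceReset helper and its reopen logic are placeholders rather than MWorks or plugin code:
#include <libusb-1.0/libusb.h>
#include <stdio.h>
/* Placeholder helper: attempt a port reset. libusb_reset_device returns
   LIBUSB_ERROR_NOT_FOUND when the device re-enumerates, in which case the
   old handle is dead and the caller must close and re-open the device. */
static int tryDeviceReset(libusb_device_handle *handle)
{
    int rc = libusb_reset_device(handle);
    if (rc == LIBUSB_ERROR_NOT_FOUND) {
        fprintf(stderr, "device re-enumerated; close and re-open the handle\n");
    } else if (rc != 0) {
        fprintf(stderr, "reset failed: %s\n", libusb_error_name(rc));
    }
    return rc;
}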
51 Posted by llglick on 15 Feb, 2022 12:48 AM
Hi Chris,
I think the problem might have been a faulty USB hub. We swapped it out on Thursday, and haven't had a crash since. Will let you know if it comes up again, but hopefully the issue is closed (and we've identified what that failure mode looks like).
Thanks,
Lindsey.
Support Staff 52 Posted by Christopher Sta... on 15 Feb, 2022 02:32 PM
OK, thanks for the update.
Chris
Christopher Stawarz closed this discussion on 08 Mar, 2022 02:57 PM.