IODevice ownership and destruction

Jonathan Hendry's Avatar

Jonathan Hendry

27 Jul, 2010 07:57 PM

We're having a problem with our LabJack driver not letting go of the USB port when an experiment is unloaded after being run.

If you load the experiment, then don't run it, then unload it, the destructor is called as expected (although the call stack reveals it is being called when an IODeviceVariableNotification is being destroyed, which is not what I would expect.)

If you load the experiment, then run it, then stop it, then unload the experiment, the destructor is not being called. As a result the cleanup is not happening and the USB port is not being released. If we then load the experiment, the load fails, because the driver can't acquire the USB port.

It's not clear why the destructor isn't being called after a run, but I suspect a shared_ptr is involved, keeping it alive. Some of the shared_ptrs in the API, such as the one held by IODeviceVariableNotification, probably ought to be weak_ptrs because they don't really 'own' the device, and ownership should be limited to the IODeviceManager.

Any suggestions on tracking down the ownership issue, or otherwise making sure the device is destroyed and cleaned up when we want it to be?

Thx,

Jon

  1. Support Staff 1 Posted by Christopher Sta... on 28 Jul, 2010 04:59 PM

    Christopher Stawarz's Avatar

    Hi Jon,

    I've reproduced the problem (using a FakeMonkey instance as the I/O device, so it's not specific to your plugin). You're probably right that there's a shared_ptr hanging around somewhere. I'll dig around some more and see if I can figure out what's happening.

    Thanks,
    Chris

  2. Support Staff 2 Posted by Christopher Sta... on 28 Jul, 2010 05:34 PM

    Christopher Stawarz's Avatar

    Hi Jon,

    Quick question: Does your device class derive from IODevice or LegacyIODevice?

    Thanks,
    Chris

  3. 3 Posted by Hendry, Jonatha... on 28 Jul, 2010 06:05 PM

    Hendry, Jonathan's Avatar

    IODevice. It was written for 0.4.3.

  4. Support Staff 4 Posted by Christopher Sta... on 28 Jul, 2010 07:00 PM

    Christopher Stawarz's Avatar

    using a FakeMonkey instance as the I/O device, so it's not specific to your plugin

    I take that back. Although I did get the same symptoms with both FakeMonkey and MSSWGamepad, it turns out that the plugins themselves were at fault.

    Both plugins were creating ScheduleTask instances, which they held onto via shared_ptr's but failed to release in stopDeviceIO. Since the ScheduleTask instances hold a shared_ptr to the IODevice, this left a reference cycle in place after the experiment was run and closed, meaning neither object was destroyed.

    I resolved the problem by adding a bit of code like the following to stopDeviceIO:

    if (schedule_node) {
        schedule_node->cancel();
        schedule_node.reset();  // Release the reference
    }
    

    Maybe your plugin needs a similar fix?

    IODevice. It was written for 0.4.3.

    OK. Although you shouldn't need to, if you do end up subclassing LegacyIODevice, be sure that LegacyIODevice::stopAllScheduleNodes gets called in your stopDeviceIO. (One option is just to call LegacyIODevice::stopDeviceIO for your subclass's stopDeviceIO.)

    Chris

  5. 5 Posted by Hendry, Jonatha... on 28 Jul, 2010 07:11 PM

    Hendry, Jonathan's Avatar

    Thanks! I'll check what we're doing.

    - Jon

  6. 6 Posted by Hendry, Jonatha... on 28 Jul, 2010 07:20 PM

    Hendry, Jonathan's Avatar

    We have:

    if (pollScheduleNode != NULL) {
    boost::mutex::scoped_lock(pollScheduleNodeLock);
            pollScheduleNode->cancel();
        }

    We have another scheduleNode that we use, for a non-repeating timed function invocation to turn off the reward. That one is only canceled in the destructor.

    However, I'm seeing the problem even if I comment out the code blocks where the scheduleNodes are created and bound.

    - Jon

  7. Support Staff 7 Posted by Christopher Sta... on 28 Jul, 2010 07:59 PM

    Christopher Stawarz's Avatar

    In addition to cancel(), you need to call reset():

    pollScheduleNode->cancel(); 
    pollScheduleNode.reset();
    

    Otherwise, you keep a reference to the ScheduleTask instance, and it can't be deallocated.

    However, I'm seeing the problem even if I comment out the code blocks where the scheduleNodes are created and bound.

    OK, but I still suspect the problem is in your code. If you send me a URL, I'll take a look and see if I can spot any issues.

    Chris

  8. 8 Posted by Jonathan Hendry on 29 Jul, 2010 11:16 PM

    Jonathan Hendry's Avatar

    Hm. I'm not getting anywhere with this. And I noticed no use of reset() on shared_ptrs in other devices.

    I've committed some changes.

    The repository is:
    git@github.com:jonhendry/MWorks.git

    The branch is wip/labjack

    The commit id is 937bc6a417533078d3c6

    (Not sure if there's a better way of getting a URL you can use from github.)

  9. Support Staff 9 Posted by Christopher Sta... on 30 Jul, 2010 06:52 PM

    Christopher Stawarz's Avatar

    (Not sure if there's a better way of getting a URL you can use from github.)

    This works:

    http://github.com/jonhendry/MWorks/tree/937bc6a417533078d3c6

    And I noticed no use of reset() on shared_ptrs in other devices.

    Normally, you don't need it. But you do need it in the situation I described, where there's a reference cycle between the IODevice and a ScheduleTask (because each holds a shared_ptr to the other).

    In your code, the only place you would need it is in LabJackU6Device::stopDeviceIO, where you cancel pollScheduleNode (and probably should cancel pulseScheduleNode, too). I say "would" because you've presumably eliminated the reference cycle by passing weak_ptr's to boost::bind when creating the schedule nodes.

    As it is, I don't see any obvious reference cycles or leaks in your code. However, I also don't think the problem you're experiencing applies to IO devices in general, since (after making the bug fixes I described previously) the destructors for FakeMonkey and MSSWGamepad are being called consistently. At least, that's true with the latest HEAD code. If you're still using your own fork of "0.4.3", then there may be other issues that you have and I don't.

    As a workaround, you could make the USB port handle a static class member, so that it's shared by all instances. That way, if one instance fails to release it, the next one can take over ownership. (This assumes that you only have one physical device.)

    Chris

  10. 10 Posted by Hendry, Jonatha... on 30 Jul, 2010 08:15 PM

    Hendry, Jonathan's Avatar

    Thanks, Chris.

    Perhaps the thing to do is try it on 0.4.4, and if that doesn't help, go to the static member.

    - Jon

  11. Support Staff 11 Posted by Christopher Sta... on 08 Apr, 2011 03:20 PM

    Christopher Stawarz's Avatar

    I believe this issue was resolved a while back. (See this dicussion and the accompanying bug fix). If you run into it again, please let us know.

    Chris

  12. Christopher Stawarz closed this discussion on 08 Apr, 2011 03:20 PM.

Comments are currently closed for this discussion. You can start a new one.

Keyboard shortcuts

Generic

? Show this help
ESC Blurs the current field

Comment Form

r Focus the comment reply box
^ + ↩ Submit the comment

You can use Command ⌘ instead of Control ^ on Mac

Recent Discussions

17 May, 2022 02:12 PM
16 May, 2022 03:12 PM
04 May, 2022 06:02 PM
03 May, 2022 01:30 PM
02 May, 2022 10:47 PM