Missing events in the mwk2 file

beshoy.agayby's Avatar

beshoy.agayby

23 Apr, 2019 05:24 PM

Hi Chris,

I am currently missing a lot of events in my mwk2 files. During the session, the behavioural window shows at least 1700 trials, but when i extract the data i only see around 500 events. The file size is also a lot smaller than before; running this experiment before would yield around 1.8GB of data for the duration of the session (2 hours with eye tracking info).
Any idea to what can cause this and how to test it?

Cheers,
Beshoy

  1. Support Staff 1 Posted by Christopher Sta... on 23 Apr, 2019 05:58 PM

    Christopher Stawarz's Avatar

    Hi Beshoy,

    Some questions:

    1. Are you extracting data using Python or MATLAB?
    2. When you say you "see around 500 events", do you mean the whole file contains only 500 events, or there are only 500 events for a particular variable?
    3. You said around 1.8GB of data is typical. What is the size of the MWK2 file?

    Thanks,
    Chris

  2. 2 Posted by beshoy.agayby on 23 Apr, 2019 06:05 PM

    beshoy.agayby's Avatar

    Hi Chris,

    Let me start from the last question. The file I get now is around 500MB. It's not the overall events nor necessarily for one events. What I meant is that if I have 1000 trials, i only find the data for 500 of that. Sorry for the confusion. Finally, I am using Matlab.

    Cheers,
    Beshoy

  3. Support Staff 3 Posted by Christopher Sta... on 23 Apr, 2019 06:20 PM

    Christopher Stawarz's Avatar

    Hi Beshoy,

    Thanks for the additional info.

    What exactly are you counting to arrive at 500 trials? Are you looking at assignments to a particular variable?

    Is it possible that some part of your analysis script assumes that getEvents returns events ordered by time (which isn't true for MWK2 files)?

    Chris

  4. 4 Posted by beshoy.agayby on 23 Apr, 2019 06:28 PM

    beshoy.agayby's Avatar

    Hi Chris,
    For example, I have a variable that's assigned at the beginning and end of each trial (start and exit states) to make data extraction easier. That gives the smaller number. Aside from this, when I check the success, failure variables that are used in the behavioural window, I also find the total number to be a fraction from the trials during the experiment.

    Cheers,
    Beshoy

  5. Support Staff 5 Posted by Christopher Sta... on 23 Apr, 2019 06:31 PM

    Christopher Stawarz's Avatar

    Hi Beshoy,

    Would it be possible for you to share the MWK2 file with me (via Dropbox, maybe?) If so, can you also tell me the name of the variable you assign at the beginning and end of each trial?

    Thanks,
    Chris

  6. 6 Posted by beshoy.agayby on 24 Apr, 2019 09:27 AM

    beshoy.agayby's Avatar

    Hi Chris,

    Sorry for the late reply; I didn't have access to the files. I checked the time stamps of one variable (the stim updates for example) and I found that the elapsed time between the first and last event is roughly 30 minutes or so. From the experiment file I attached, you can see that the streaming started around 14:40 and the stopped at 16:30 (file last modified).
    Below is the link for the file (let me know if it doesn't work). The variable is "afcc_onset_trial"; it's assigned to 0 after fixation of the animal and to 1 at the exit state. There are trials with fixation break which aren't counted in the behavioural window during the experiment and that I can exclude from my data after extraction by looking at the variable "fix_break".
    https://newcastle-my.sharepoint.com/:u:/g/personal/nba41_newcastle_...

    Cheers,
    Beshoy

  7. Support Staff 7 Posted by Christopher Sta... on 24 Apr, 2019 02:10 PM

    Christopher Stawarz's Avatar

    Hi Beshoy,

    Thanks for sharing the file. I'm looking at it now.

    Chris

  8. 8 Posted by beshoy.agayby on 24 Apr, 2019 02:16 PM

    beshoy.agayby's Avatar

    Hi Chris,

    Thanks for this. I have updated mworks to the latest nightly version and had another session today. I split the data into two files each for about an hour long. I again looked at the stim updates time stamps and found the same issue. One file ran for an hour and only found timestamps for 22 minutes and the other ran for an hour and a half and found timestamps for 26 minutes.

    Cheers,
    Beshoy

  9. Support Staff 9 Posted by Christopher Sta... on 24 Apr, 2019 02:44 PM

    Christopher Stawarz's Avatar

    Hi Beshoy,

    Looking at the data file you shared, I see a gap of about 72 minutes between the RESERVED_TERMINATION_CODE event (the final event written to the file before it's closed) and the preceding event. This suggests that the MWorks thread responsible for writing events to the data file stopped receiving new events at some point, but did not otherwise fail (since it was still able to write the termination event). I'm trying to understand how this could be the case.

    A few more questions:

    • Are you seeing any error messages in the MWorks console while your experiments are running? (There were none in the data file.)
    • Are you using any custom core plugins (e.g. for I/O devices or stimuli)? If yes, have you re-compiled them against the version of MWorks you're running (always required when you upgrade MWorks)?

    Thanks,
    Chris

  10. 10 Posted by beshoy.agayby on 24 Apr, 2019 02:48 PM

    beshoy.agayby's Avatar

    Hi Chris,

    Thanks for the reply. I do get a warning and it's related to the eyelink plugin if I understand correctly; here it is:
    WARNING: Scheduled task (/Users/cstawarz/Documents/mworks_buildbot/worker/build_for_macos/build/plugins/core/EyeLink1000/Eyelink.cpp:342: eye_tracker) falling behind, dropping 28 scheduled executions (priority = 2, interval = 1000, task = 0x600ed3ebcda0)

    The IO devices that I am using are the ones for Eyelink adn NIDAQ; how can I recomplie them?

    Cheers,
    Beshoy

  11. Support Staff 11 Posted by Christopher Sta... on 24 Apr, 2019 02:55 PM

    Christopher Stawarz's Avatar

    Hi Beshoy,

    I don't think the EyeLink warning is relevant.

    By "custom plugins", I meant plugins that aren't included in the nightly build, which you built and/or installed yourself. I presume you're just using the EyeLink and NIDAQ plugins that came with the nightly build, correct? If so, then those are always up to date.

    I asked about custom plugins, because using a plugin compiled for an old MWorks build with a newer MWorks build can lead to memory corruption. The symptoms of that are usually quite obvious (e.g. the server crashes, or the plugin fails to load), but I wanted to check.

    Chris

  12. 12 Posted by beshoy.agayby on 24 Apr, 2019 02:58 PM

    beshoy.agayby's Avatar

    Hi Chris,

    Aha, I get it now. Yes, i used the one provided; we had one custom plugin for a DAQ in the past but we moved away from this ages ago.

    Cheers,
    Beshoy

  13. Support Staff 13 Posted by Christopher Sta... on 24 Apr, 2019 04:27 PM

    Christopher Stawarz's Avatar

    Hi Beshoy,

    Some more questions:

    • Is your MWorks data file directory ($HOME/Documents/MWorks/Data) located on a local drive, i.e. internal or attached via USB or Thunderbolt? Or is it on a network-attached drive (presumably mapped in to your home directory via a symbolic link)?

    • The time stamps in your event file indicate that MWServer had been running for over 13 days. Is that correct? If so, that's impressive :)

    • In the MWorks data directory, are there any -journal files associated with the data files that are missing events (e.g. run1.mwk2-journal for run1.mwk2)? Normally, these should be present only when the data file is open and removed when it's closed.

    Thanks,
    Chris

  14. Support Staff 14 Posted by Christopher Sta... on 24 Apr, 2019 04:31 PM

    Christopher Stawarz's Avatar

    Also, on the machine running the MWServer instance that's generating the incomplete data files, can you run the following command in Terminal and send me the output?

    system_profiler SPSoftwareDataType SPHardwareDataType SPDisplaysDataType SPStorageDataType
    

    Thanks,
    Chris

  15. 15 Posted by beshoy.agayby on 25 Apr, 2019 11:02 AM

    beshoy.agayby's Avatar

    Hi Chris,

    It's the directory that you described, we didn't change the default one. Instead, we copy the files afterwards. As the computer is always running, we don't really have a reason for disconnecting the server. I will try closing the connection and seeing what happens; maybe some funny business is causing this issue :) .

    I saw the journal file you describe and indeed it disappears once the connection is closed. I have also attached the output of the terminal command.

    Cheers,
    Beshoy

  16. Support Staff 16 Posted by Christopher Sta... on 25 Apr, 2019 02:35 PM

    Christopher Stawarz's Avatar

    Hi Beshoy,

    As the computer is always running, we don't really have a reason for disconnecting the server. I will try closing the connection and seeing what happens; maybe some funny business is causing this issue :) .

    I mostly asked as a sanity check. Also, while there's no reason that you shouldn't be able to run MWServer for that long, I'd never seen it done before, hence my interest.

    In any case, I don't think this issue could be related to the length of time MWServer has been running, because you just reproduced it with the latest night build. One question: Did you observe the data loss on the first run after launching MWServer? Or did it only begin after several runs?

    I have also attached the output of the terminal command.

    Thanks for that. The only thing that catches my eye there is that your iMac is running macOS 10.14 (aka 10.14.0) rather than 10.14.4. Apple's scant release notes don't note any bug fixes that seem relevant to this problem. However, just to cover all the bases, could you try upgrading to 10.14.4?

    Also, have you observed the data loss problem on any other machines?

    On my end, I tried running a long data collection on my Mac Pro, using a slightly-modified version of the RSVPDemo experiment that ships with MWorks. I ran for over an hour, ultimately generating over 800MB of data. During the run, the MWK2 file steadily grew in size, and later analysis didn't reveal any missing events. This is only one data point, of course, but it at least establishes that things work as intended some of the time.

    As a next step, I'm going to prepare a special build of MWorks that does detailed logging of what's happening in the data-file writing code. I'll then run this on all of my test machines, and I'll ask you to install and run it on your iMac. Hopefully this will provide some insight in to the issue.

    Thanks,
    Chris

  17. Support Staff 17 Posted by Christopher Sta... on 25 Apr, 2019 04:52 PM

    Christopher Stawarz's Avatar

    Hi Beshoy,

    As a next step, I'm going to prepare a special build of MWorks that does detailed logging of what's happening in the data-file writing code.

    This is done. You can grab the installer at

    https://www.dropbox.com/s/6x27qaldk02erml/MWorks-event_file_debug.d...

    This build is identical to the current nightly, except that it logs details of the event-file writing process to the file /tmp/mwserver_event_file_log.txt.

    Can you please install this build on the machine that's had data loss and try running your experiment with it? If you observe the data loss again, please send me the log file mentioned above.

    On my end, I'm about an hour into running this build on my three test Mac's, and I haven't seen any problems so far.

    Thanks,
    Chris

  18. Support Staff 18 Posted by Christopher Sta... on 25 Apr, 2019 08:25 PM

    Christopher Stawarz's Avatar

    On my end, I'm about an hour into running this build on my three test Mac's, and I haven't seen any problems so far.

    I just stopped this test. After running for over four hours, all three Mac's had generated MWK2 files over 2GB in size. Right up to the point when I stopped the experiment, the event files were steadily growing. The logs indicated no problems whatsoever -- just a steady stream of events successfully written.

    Chris

  19. 19 Posted by beshoy.agayby on 30 Apr, 2019 11:16 AM

    beshoy.agayby's Avatar

    Hi Chris,

    Sorry for the late reply. I ran the experiment on Friday using the Mworks version you sent. It ran for over an hour and a half; even though the file size is larger than before, I think the issue is still there. I have checked the stim updates and it shows a run time of 30 mins or so. I have included a link of the file for you to check. Additionally, I would like to know if I can get the time on the experiment computer from the system variable.

    https://newcastle-my.sharepoint.com/:u:/g/personal/nba41_newcastle_...

    Cheers,
    Beshoy

  20. Support Staff 20 Posted by Christopher Sta... on 30 Apr, 2019 12:25 PM

    Christopher Stawarz's Avatar

    Hi Beshoy,

    I have included a link of the file for you to check.

    Thanks, but what I really need is the special log file: /tmp/mwserver_event_file_log.txt. It's overwritten every time you open an event file, so unless you haven't run any experiments since you generated the last faulty data file, I'll need you to generate another one. Sorry for not being clearer.

    Chris

  21. 21 Posted by beshoy.agayby on 30 Apr, 2019 01:43 PM

    beshoy.agayby's Avatar

    Hi Chris,

    Found it, I attached it here.

    Cheers,
    Beshoy

  22. Support Staff 22 Posted by Christopher Sta... on 01 May, 2019 02:38 PM

    Christopher Stawarz's Avatar

    Hi Beshoy,

    Thanks for sending the log file. It has shed considerable light on the problem.

    It appears that the data-file writing code isn't losing track of any events, and it keeps on writing them to disk right up until the point that you close the data file. However, at some point in the experiment, the rate at which events are written falls drastically below the rate at which events are generated, producing an enormous backlog (e.g. 30 minutes or more) of events. When you tell MWorks to close the data file, the data-file writing code stops as soon as possible, and any events in the backlog are never written to disk.

    So there are two problems here:

    1. At some point, something is causing the rate at which events are written to the event file to plummet.
    2. MWorks isn't handling the backlog of unwritten events produced by problem 1.

    Problem 2 is certainly fixable on my end. I'm working on that now, and I'll get an updated build to you ASAP.

    Problem 1 is a bit of a mystery. I can think of three explanations why a backlog of events might be building up:

    1. The experiment is simply generating way more events per unit time than can be written to disk in that time.
    2. The CPU is overloaded, so the data-file writing code isn't running as often as needed.
    3. The actual writes to disk are taking an unreasonably long time.

    Looking at your data file, I see no support for explanation 1. Your experiment seems to generate a perfectly reasonable number of events per unit time.

    I also find explanation 2 dubious. If the CPU were really that overloaded, you would probably be noticing it in myriad ways (e.g. long application opening times, slow user interfaces). Unless that has been your experience, I'm inclined to rule this out, too.

    That leaves us with explanation 3. If disk I/O is taking an exceptionally long time, then I'd expect that either

    • other processes are also doing a lot of disk I/O, causing a lot of contention and slowing things down for everyone, or
    • there's a physical problem with the disk.

    You can get a sense of the system's disk I/O activity by opening the Activity Monitor application and switching to the "Disk" tab. If you see huge numbers by "Data read/sec" or "Data written/sec", then that may be the underlying issue.

    On the other hand, if the issue is with the health of the disk, then I'm not sure how best to diagnose that. I suppose you could try running Disk Utility's "First Aid" procedure on it, as that may find some problems.

    As I said, I'm working on a new MWorks build for you to try. In this build, I'll include some additional logging that will hopefully clarify why the event backlog is happening. I'll let you know as soon as I have it ready.

    Thank you for your patience and assistance in debugging this issue!

    Chris

  23. Support Staff 23 Posted by Christopher Sta... on 01 May, 2019 05:40 PM

    Christopher Stawarz's Avatar

    Hi Beshoy,

    The new MWorks build is ready. You can grab the installer at

    https://www.dropbox.com/s/paazaumgbp2o9t4/MWorks-event_file_debug_2...

    As with the previous build, this one logs details of the event-file writing process to the file /tmp/mwserver_event_file_log.txt. However, the log includes more details than before.

    With this build, your event files should not be missing any events. (If they are, then I've done something wrong.) However, there should still be an event backlog. The evidence of this will be in the server console. Upon closing the event file, two messages will be logged to the console. For example:

    00:00:18:  Closing data file...
    00:00:18:  Closed data file: /Users/cstawarz/Documents/MWorks/Data/test.mwk2
    

    The crucial point is that the first message is logged before the backlog is cleared, while the second message is logged afterward. Under normal circumstances, there should be no elapsed time between the two messages (as shown above). However, in the presence of an event backlog, there will be a substantial amount of time between them (e.g. 30 minutes or more, depending on how long the experiment ran and when the backlog started).

    Therefore, be sure to wait for both messages before closing MWServer. If you don't, you will lose backlogged events (and may corrupt the event file). You may have to wait a long time, so be patient.

    Please install this build on the machine that's had data loss and try running your experiment again. If you observe an event backlog (signaled by a long delay between the two messages as described above), please send me the special log file and, if possible, the event file.

    Thanks,
    Chris

Reply to this discussion

Internal reply

Formatting help / Preview (switch to plain text) No formatting (switch to Markdown)

Attaching KB article:

»

Attached Files

You can attach files up to 10MB

If you don't have an account yet, we need to confirm you're human and not a machine trying to post spam.

Keyboard shortcuts

Generic

? Show this help
ESC Blurs the current field

Comment Form

r Focus the comment reply box
^ + ↩ Submit the comment

You can use Command ⌘ instead of Control ^ on Mac