Missing events in the mwk2 file
Hi Chris,
I am currently missing a lot of events in my mwk2 files. During the session, the behavioural window shows at least 1700 trials, but when i extract the data i only see around 500 events. The file size is also a lot smaller than before; running this experiment before would yield around 1.8GB of data for the duration of the session (2 hours with eye tracking info).
Any idea to what can cause this and how to test it?
Cheers,
Beshoy
Comments are currently closed for this discussion. You can start a new one.
Keyboard shortcuts
Generic
? | Show this help |
---|---|
ESC | Blurs the current field |
Comment Form
r | Focus the comment reply box |
---|---|
^ + ↩ | Submit the comment |
You can use Command ⌘
instead of Control ^
on Mac
Support Staff 1 Posted by Christopher Sta... on 23 Apr, 2019 05:58 PM
Hi Beshoy,
Some questions:
Thanks,
Chris
2 Posted by beshoy.agayby on 23 Apr, 2019 06:05 PM
Hi Chris,
Let me start from the last question. The file I get now is around 500MB. It's not the overall events nor necessarily for one events. What I meant is that if I have 1000 trials, i only find the data for 500 of that. Sorry for the confusion. Finally, I am using Matlab.
Cheers,
Beshoy
Support Staff 3 Posted by Christopher Sta... on 23 Apr, 2019 06:20 PM
Hi Beshoy,
Thanks for the additional info.
What exactly are you counting to arrive at 500 trials? Are you looking at assignments to a particular variable?
Is it possible that some part of your analysis script assumes that
getEvents
returns events ordered by time (which isn't true for MWK2 files)?Chris
4 Posted by beshoy.agayby on 23 Apr, 2019 06:28 PM
Hi Chris,
For example, I have a variable that's assigned at the beginning and end of each trial (start and exit states) to make data extraction easier. That gives the smaller number. Aside from this, when I check the success, failure variables that are used in the behavioural window, I also find the total number to be a fraction from the trials during the experiment.
Cheers,
Beshoy
Support Staff 5 Posted by Christopher Sta... on 23 Apr, 2019 06:31 PM
Hi Beshoy,
Would it be possible for you to share the MWK2 file with me (via Dropbox, maybe?) If so, can you also tell me the name of the variable you assign at the beginning and end of each trial?
Thanks,
Chris
6 Posted by beshoy.agayby on 24 Apr, 2019 09:27 AM
Hi Chris,
Sorry for the late reply; I didn't have access to the files. I checked the time stamps of one variable (the stim updates for example) and I found that the elapsed time between the first and last event is roughly 30 minutes or so. From the experiment file I attached, you can see that the streaming started around 14:40 and the stopped at 16:30 (file last modified).
Below is the link for the file (let me know if it doesn't work). The variable is "afcc_onset_trial"; it's assigned to 0 after fixation of the animal and to 1 at the exit state. There are trials with fixation break which aren't counted in the behavioural window during the experiment and that I can exclude from my data after extraction by looking at the variable "fix_break".
https://newcastle-my.sharepoint.com/:u:/g/personal/nba41_newcastle_...
Cheers,
Beshoy
Support Staff 7 Posted by Christopher Sta... on 24 Apr, 2019 02:10 PM
Hi Beshoy,
Thanks for sharing the file. I'm looking at it now.
Chris
8 Posted by beshoy.agayby on 24 Apr, 2019 02:16 PM
Hi Chris,
Thanks for this. I have updated mworks to the latest nightly version and had another session today. I split the data into two files each for about an hour long. I again looked at the stim updates time stamps and found the same issue. One file ran for an hour and only found timestamps for 22 minutes and the other ran for an hour and a half and found timestamps for 26 minutes.
Cheers,
Beshoy
Support Staff 9 Posted by Christopher Sta... on 24 Apr, 2019 02:44 PM
Hi Beshoy,
Looking at the data file you shared, I see a gap of about 72 minutes between the RESERVED_TERMINATION_CODE event (the final event written to the file before it's closed) and the preceding event. This suggests that the MWorks thread responsible for writing events to the data file stopped receiving new events at some point, but did not otherwise fail (since it was still able to write the termination event). I'm trying to understand how this could be the case.
A few more questions:
Thanks,
Chris
10 Posted by beshoy.agayby on 24 Apr, 2019 02:48 PM
Hi Chris,
Thanks for the reply. I do get a warning and it's related to the eyelink plugin if I understand correctly; here it is:
WARNING: Scheduled task (/Users/cstawarz/Documents/mworks_buildbot/worker/build_for_macos/build/plugins/core/EyeLink1000/Eyelink.cpp:342: eye_tracker) falling behind, dropping 28 scheduled executions (priority = 2, interval = 1000, task = 0x600ed3ebcda0)
The IO devices that I am using are the ones for Eyelink adn NIDAQ; how can I recomplie them?
Cheers,
Beshoy
Support Staff 11 Posted by Christopher Sta... on 24 Apr, 2019 02:55 PM
Hi Beshoy,
I don't think the EyeLink warning is relevant.
By "custom plugins", I meant plugins that aren't included in the nightly build, which you built and/or installed yourself. I presume you're just using the EyeLink and NIDAQ plugins that came with the nightly build, correct? If so, then those are always up to date.
I asked about custom plugins, because using a plugin compiled for an old MWorks build with a newer MWorks build can lead to memory corruption. The symptoms of that are usually quite obvious (e.g. the server crashes, or the plugin fails to load), but I wanted to check.
Chris
12 Posted by beshoy.agayby on 24 Apr, 2019 02:58 PM
Hi Chris,
Aha, I get it now. Yes, i used the one provided; we had one custom plugin for a DAQ in the past but we moved away from this ages ago.
Cheers,
Beshoy
Support Staff 13 Posted by Christopher Sta... on 24 Apr, 2019 04:27 PM
Hi Beshoy,
Some more questions:
Is your MWorks data file directory (
$HOME/Documents/MWorks/Data
) located on a local drive, i.e. internal or attached via USB or Thunderbolt? Or is it on a network-attached drive (presumably mapped in to your home directory via a symbolic link)?The time stamps in your event file indicate that MWServer had been running for over 13 days. Is that correct? If so, that's impressive :)
In the MWorks data directory, are there any
-journal
files associated with the data files that are missing events (e.g.run1.mwk2-journal
forrun1.mwk2
)? Normally, these should be present only when the data file is open and removed when it's closed.Thanks,
Chris
Support Staff 14 Posted by Christopher Sta... on 24 Apr, 2019 04:31 PM
Also, on the machine running the MWServer instance that's generating the incomplete data files, can you run the following command in Terminal and send me the output?
Thanks,
Chris
15 Posted by beshoy.agayby on 25 Apr, 2019 11:02 AM
Hi Chris,
It's the directory that you described, we didn't change the default one. Instead, we copy the files afterwards. As the computer is always running, we don't really have a reason for disconnecting the server. I will try closing the connection and seeing what happens; maybe some funny business is causing this issue :) .
I saw the journal file you describe and indeed it disappears once the connection is closed. I have also attached the output of the terminal command.
Cheers,
Beshoy
Support Staff 16 Posted by Christopher Sta... on 25 Apr, 2019 02:35 PM
Hi Beshoy,
I mostly asked as a sanity check. Also, while there's no reason that you shouldn't be able to run MWServer for that long, I'd never seen it done before, hence my interest.
In any case, I don't think this issue could be related to the length of time MWServer has been running, because you just reproduced it with the latest night build. One question: Did you observe the data loss on the first run after launching MWServer? Or did it only begin after several runs?
Thanks for that. The only thing that catches my eye there is that your iMac is running macOS 10.14 (aka 10.14.0) rather than 10.14.4. Apple's scant release notes don't note any bug fixes that seem relevant to this problem. However, just to cover all the bases, could you try upgrading to 10.14.4?
Also, have you observed the data loss problem on any other machines?
On my end, I tried running a long data collection on my Mac Pro, using a slightly-modified version of the RSVPDemo experiment that ships with MWorks. I ran for over an hour, ultimately generating over 800MB of data. During the run, the MWK2 file steadily grew in size, and later analysis didn't reveal any missing events. This is only one data point, of course, but it at least establishes that things work as intended some of the time.
As a next step, I'm going to prepare a special build of MWorks that does detailed logging of what's happening in the data-file writing code. I'll then run this on all of my test machines, and I'll ask you to install and run it on your iMac. Hopefully this will provide some insight in to the issue.
Thanks,
Chris
Support Staff 17 Posted by Christopher Sta... on 25 Apr, 2019 04:52 PM
Hi Beshoy,
This is done. You can grab the installer at
https://www.dropbox.com/s/6x27qaldk02erml/MWorks-event_file_debug.d...
This build is identical to the current nightly, except that it logs details of the event-file writing process to the file
/tmp/mwserver_event_file_log.txt
.Can you please install this build on the machine that's had data loss and try running your experiment with it? If you observe the data loss again, please send me the log file mentioned above.
On my end, I'm about an hour into running this build on my three test Mac's, and I haven't seen any problems so far.
Thanks,
Chris
Support Staff 18 Posted by Christopher Sta... on 25 Apr, 2019 08:25 PM
I just stopped this test. After running for over four hours, all three Mac's had generated MWK2 files over 2GB in size. Right up to the point when I stopped the experiment, the event files were steadily growing. The logs indicated no problems whatsoever -- just a steady stream of events successfully written.
Chris
19 Posted by beshoy.agayby on 30 Apr, 2019 11:16 AM
Hi Chris,
Sorry for the late reply. I ran the experiment on Friday using the Mworks version you sent. It ran for over an hour and a half; even though the file size is larger than before, I think the issue is still there. I have checked the stim updates and it shows a run time of 30 mins or so. I have included a link of the file for you to check. Additionally, I would like to know if I can get the time on the experiment computer from the system variable.
https://newcastle-my.sharepoint.com/:u:/g/personal/nba41_newcastle_...
Cheers,
Beshoy
Support Staff 20 Posted by Christopher Sta... on 30 Apr, 2019 12:25 PM
Hi Beshoy,
Thanks, but what I really need is the special log file:
/tmp/mwserver_event_file_log.txt
. It's overwritten every time you open an event file, so unless you haven't run any experiments since you generated the last faulty data file, I'll need you to generate another one. Sorry for not being clearer.Chris
21 Posted by beshoy.agayby on 30 Apr, 2019 01:43 PM
Hi Chris,
Found it, I attached it here.
Cheers,
Beshoy
Support Staff 22 Posted by Christopher Sta... on 01 May, 2019 02:38 PM
Hi Beshoy,
Thanks for sending the log file. It has shed considerable light on the problem.
It appears that the data-file writing code isn't losing track of any events, and it keeps on writing them to disk right up until the point that you close the data file. However, at some point in the experiment, the rate at which events are written falls drastically below the rate at which events are generated, producing an enormous backlog (e.g. 30 minutes or more) of events. When you tell MWorks to close the data file, the data-file writing code stops as soon as possible, and any events in the backlog are never written to disk.
So there are two problems here:
Problem 2 is certainly fixable on my end. I'm working on that now, and I'll get an updated build to you ASAP.
Problem 1 is a bit of a mystery. I can think of three explanations why a backlog of events might be building up:
Looking at your data file, I see no support for explanation 1. Your experiment seems to generate a perfectly reasonable number of events per unit time.
I also find explanation 2 dubious. If the CPU were really that overloaded, you would probably be noticing it in myriad ways (e.g. long application opening times, slow user interfaces). Unless that has been your experience, I'm inclined to rule this out, too.
That leaves us with explanation 3. If disk I/O is taking an exceptionally long time, then I'd expect that either
You can get a sense of the system's disk I/O activity by opening the Activity Monitor application and switching to the "Disk" tab. If you see huge numbers by "Data read/sec" or "Data written/sec", then that may be the underlying issue.
On the other hand, if the issue is with the health of the disk, then I'm not sure how best to diagnose that. I suppose you could try running Disk Utility's "First Aid" procedure on it, as that may find some problems.
As I said, I'm working on a new MWorks build for you to try. In this build, I'll include some additional logging that will hopefully clarify why the event backlog is happening. I'll let you know as soon as I have it ready.
Thank you for your patience and assistance in debugging this issue!
Chris
Support Staff 23 Posted by Christopher Sta... on 01 May, 2019 05:40 PM
Hi Beshoy,
The new MWorks build is ready. You can grab the installer at
https://www.dropbox.com/s/paazaumgbp2o9t4/MWorks-event_file_debug_2...
As with the previous build, this one logs details of the event-file writing process to the file
/tmp/mwserver_event_file_log.txt
. However, the log includes more details than before.With this build, your event files should not be missing any events. (If they are, then I've done something wrong.) However, there should still be an event backlog. The evidence of this will be in the server console. Upon closing the event file, two messages will be logged to the console. For example:
The crucial point is that the first message is logged before the backlog is cleared, while the second message is logged afterward. Under normal circumstances, there should be no elapsed time between the two messages (as shown above). However, in the presence of an event backlog, there will be a substantial amount of time between them (e.g. 30 minutes or more, depending on how long the experiment ran and when the backlog started).
Therefore, be sure to wait for both messages before closing MWServer. If you don't, you will lose backlogged events (and may corrupt the event file). You may have to wait a long time, so be patient.
Please install this build on the machine that's had data loss and try running your experiment again. If you observe an event backlog (signaled by a long delay between the two messages as described above), please send me the special log file and, if possible, the event file.
Thanks,
Chris
Christopher Stawarz closed this discussion on 04 Sep, 2019 04:24 PM.