[MLC@Home] [TMIM Notes] Aug 6 2021



Post by BOINC_News »

This Month in MLC@Home Notes for Aug 6 2021
A monthly summary of news and notes for MLC@Home

Another month of good progress on MLC! First, this past month saw the completion of DS3! You have trained over 1,000,000 neural networks for DS3, which is a huge accomplishment. We're continuing to bundle and evaluate the dataset, so look for a complete public release shortly.

We also spent some time on the backend preparing for DS4. We've updated the website to show DS4 progress, but haven't sent any DS4 WUs yet. Instead, we spent the bulk of the month trying to get the new client to work under Windows, which hasn't been going well. We spent a good 2.5 weeks trying to get PyTorch and the client to compile (and run) statically on Windows. Even though it now compiles, the client crashes when running. So last week we switched back to linking dynamically, and we want to get an updated Windows client out this weekend. The Linux/CPU version of the new client appears to be performing fantastically, so thanks to everyone who ran and tested WUs from the "mldstest" queue!

DS4 WUs are incompatible with the older client, so we'll only release DS4 WUs as the new (v9.9x) client becomes available for each platform. This means CPUs first. GPUs will continue to work on finishing up DS1 and DS2.

Speaking of DS1/DS2, we're approaching the end of DS1 with only a few more weeks to go, and when we complete those networks we'll switch to DS2 to finish those up as well.

So, lots of movement this month behind the scenes, and great progress on the existing datasets. If *any* Windows developers would like to help us get the new Windows CPU client out the door, please contact us directly; we could use the help.

Other News
  • GPU and CPU queues are stacked with DS1/DS2 WUs until those complete and/or until DS4 is ready. The CPU queue will transition to DS4 first, as we assume GPU builds will be even more of a headache than the CPU ones have been.
    Because existing WUs are incompatible with the new client, we've been keeping the CPU and GPU queues a little less full than we have in the past: when the new client comes out, we don't want to have to cancel a bunch of existing WUs and replace them with new ones. Unfortunately, since the updated Windows CPU client is taking longer than planned, we've run out of WUs a few times in the past month. We're trying to stay on top of this, and we're automating the process to keep the queues from running completely dry in the future.
  • A new developer has joined the team and is working on more graceful handling of NaN errors, which have been an issue for a long time. If that's ready before the Windows CPU client is, the fix will be in the next release. You can see more in the #devel channel on Discord, or in the issue on GitLab.
  • Reminder: the MLC client is open source, and has an issues list on GitLab. If you're a programmer or data scientist and want to help, feel free to look over the issues and submit a pull request.
Project status snapshot:
(note these numbers are approximations)



Last month's TMIM Notes: Jul 1 2021

Thanks again to all our volunteers!

-- The MLC@Home Admin(s)
Homepage: https://www.mlcathome.org/
Discord invite: https://discord.gg/BdE4PGpX2y
Twitter: @MLCHome2

Source: https://www.mlcathome.org/mlcathome/for ... php?id=225