I primarily built this for group in-person listening, and that's what the spatial audio controls are for. But what is interesting is that since it only requires the browser, it works across the internet as well. You can guarantee that you and someone else are listening to the same thing even across an ocean.
Someone brought up the idea of an internet radio, which I thought was cool: you could see a list of all the rooms people are in and tune in to exactly what they're jamming to.
Ne02ptzero 4 hours ago [-]
> You can guarantee that you and someone else are listening to the same thing even across an ocean.
How can you guarantee that? NTP fails to guarantee that all clocks are synced inside a datacenter, let alone across an ocean (Did not read the code yet)
EDIT: The wording got me. "Guarantee" & "Perfect" in the post title, and "Millisecond-accurate synchronization" in the README. Cool project!
moomin 4 hours ago [-]
More, the speed of light puts a hard cap on how simultaneous you can be. Wolfram Alpha reckons New York to London is 19ms in a vacuum, more using fibre.
Going off on a tangent: Back in the days of Live Aid, they tried doing a transatlantic duet. Turns out it’s literally physically impossible because if A sings when they hear B, then B hears A at least 38ms too late, which is too much for the human body to handle and still make music.
recursive 51 minutes ago [-]
It's a less hard problem than the duet. If the round-trip is 38ms, you can estimate that the one-way latency is 19ms. You tell the other client to play the audio now, and you schedule it for 19ms in the future.
That's assuming standard OS and hardware and drivers can manage latency with that degree of precision, which I have serious doubts about.
In a duet, your partner needs to hear you now and you need to hear them now. With pre-recorded audio, you can buffer into the future.
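A minimal sketch of that scheduling idea, assuming a symmetric network path (so one-way latency is half the round trip) and ignoring OS and driver jitter entirely. The function names are hypothetical and not taken from the project:

```typescript
// Estimate one-way latency as half the measured round trip.
// Assumes the path is symmetric, which real networks do not guarantee.
function estimateOneWayMs(roundTripMs: number): number {
  return roundTripMs / 2;
}

// Use the smallest of several round-trip samples; larger samples usually
// include queueing delay rather than pure propagation time.
function bestRoundTripMs(samples: number[]): number {
  if (samples.length === 0) {
    throw new Error("need at least one round-trip sample");
  }
  return Math.min(...samples);
}

// The sender tells the peer to "play now" and delays its own playback by
// the estimated one-way latency, so both start at about the same instant.
function localStartDelayMs(roundTripSamplesMs: number[]): number {
  return estimateOneWayMs(bestRoundTripMs(roundTripSamplesMs));
}
```

With round-trip samples of 41, 38, and 52 ms, the sender would delay its own playback by 19 ms. Because the audio is pre-recorded, both sides could also agree on a start time further in the future, which leaves more room for a slow message.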
thruflo 3 hours ago [-]
This looks really cool, congrats!
Just to share a couple of similar/related projects in case useful for reference:
This is very, very cool; it's a thing I've been looking for on my backburner for several years. It's a very interesting problem.
There are a ton of directions I can think about you taking it in.
The household application: this one is already pretty directly applicable. Have a bunch of wireless speakers and you should be able to make it sound really good from anywhere, yes? You would probably want support for static configurations, and there's a good chance each client isn't going to be able to run the full suite, but the server can probably still figure out what to send to each client based on timing data.
Relatedly, it would be nice to have a sense of "facing" for the point on the virtual grid and adjust 5.1 channels accordingly, automatically (especially left/right). [Oh, maybe this is already implicit in the grid - "up" is "forward"?]
The party application: this would be a cool trick that would take a lot more work. What if each device could locate itself in actual space automatically and figure out its sync accordingly as it moved? This might not be possible purely with software - especially with just the browser's access to sensors related to high-accuracy location based on, for example, wi-fi sources. However, it would be utterly magical to be able to install an app, join a host, and let your phone join a mob of other phones as individual speakers in everyone's pockets at a party and have positional audio "just work." The "wow" factor would be off the charts.
On a related note, it could be interesting to add a "jukebox" front-end - some way for clients to submit and negotiate tracks for the play queue.
Another idea - account for copper and optical cabling. The latency issue isn't restricted to the clocks that you can see. Adjusting audio timing for long audio cable runs matters a lot in large areas (say, a stadium or performance hall) but it can still matter in house-sized settings, too, depending on how speakers are wired. For a laptop speaker, there's no practical offset between the clock's time and the time at which sound plays, but if the audio output is connected to a cable run, it would be nice - and probably not very hard - to add some static timing offset for the physical layer associated with a particular output (or even channel). It might even be worth it to be able to calculate it for the user. (This speaker is 300 feet away from its output through X meters of copper; figure out my additional latency offset for me.)
camtarn 4 hours ago [-]
> This speaker is 300 feet away from its output through X meters of copper; figure out my additional latency offset for me.
0.3 microseconds. The period of a wave at 20kHz (very roughly the highest pitch we can hear) is 50 microseconds. So - more or less insignificant.
Cable latency is basically never an issue for audio. Latency due to speed of sound in air is what you see techs at stadiums and performance halls tuning.
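To put numbers on that comparison, here is a small sketch of the arithmetic. The constants are standard physical values; the `velocityFactor` parameter is an assumption added for illustration (real cables carry signals somewhat slower than light in a vacuum, and 1.0 gives the lower bound that matches the 0.3 microsecond figure):

```typescript
const FEET_TO_METERS = 0.3048;
const SPEED_OF_LIGHT_M_PER_S = 299792458;
const SPEED_OF_SOUND_M_PER_S = 343; // dry air at roughly 20 degrees C

// Electrical propagation delay through a cable, in microseconds.
// velocityFactor is the fraction of c at which the signal travels.
function cableDelayMicros(lengthFeet: number, velocityFactor: number = 1.0): number {
  const meters = lengthFeet * FEET_TO_METERS;
  return (meters / (SPEED_OF_LIGHT_M_PER_S * velocityFactor)) * 1e6;
}

// Acoustic propagation delay through air, in milliseconds.
function airDelayMillis(distanceFeet: number): number {
  const meters = distanceFeet * FEET_TO_METERS;
  return (meters / SPEED_OF_SOUND_M_PER_S) * 1000;
}
```

For 300 feet, `cableDelayMicros(300)` comes out at about 0.3 microseconds, while `airDelayMillis(300)` is about 267 milliseconds, so the delay through air is hundreds of thousands of times larger than the delay through the cable.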
freemanjiang 4 hours ago [-]
Thank you for the kind words! Yeah, I think it gets a lot more complicated once you start dealing with speaker hardware. It pretty much only works for the device's native speaker at the moment.
The instant you start having wireless speakers (eg. bluetooth) or any sort of significant delay between commanding playback and the actual sound coming out, the latency becomes audible.
raisedbyninjas 50 minutes ago [-]
For devices with mics, can you have them play a test chirp to measure the latency of Bluetooth or other laggy sound stack?
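One way the chirp idea could work is to cross-correlate the known chirp against the microphone recording and take the best-matching lag as the delay. The sketch below is only an illustration of that technique, with hypothetical names; it is not something the project is known to do. Note that the measured lag would include microphone input latency as well as output latency, and this sketch does not separate the two:

```typescript
// Returns the lag (in samples) at which `recorded` best matches `reference`,
// using brute-force cross-correlation. O(n * m), fine for a short chirp.
// Returns 0 if the recording is shorter than the reference.
function findLagSamples(reference: number[], recorded: number[]): number {
  let bestLag = 0;
  let bestScore = -Infinity;
  const maxLag = recorded.length - reference.length;
  for (let lag = 0; lag <= maxLag; lag++) {
    let score = 0;
    for (let i = 0; i < reference.length; i++) {
      score += reference[i] * recorded[lag + i];
    }
    if (score > bestScore) {
      bestScore = score;
      bestLag = lag;
    }
  }
  return bestLag;
}

// Convert a lag in samples to milliseconds at the given sample rate.
function lagToMillis(lagSamples: number, sampleRateHz: number): number {
  return (lagSamples / sampleRateHz) * 1000;
}
```

A lag of 480 samples at 48 kHz corresponds to 10 ms. In practice the recording would start when playback is commanded, so that lag 0 means "no delay".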
hgomersall 5 hours ago [-]
Silent disco in which everyone brings their own source and headphones.
daredoes 5 hours ago [-]
Have you seen snapcast? That's currently my go-to audio sync solution for running whole house audio. Always open to alternatives, but so far nothing beats the performance and accessibility
freemanjiang 5 hours ago [-]
yes but only after posting! it's very cool—i'm actually a little embarrassed to not have seen it before.
they're doing a smarter thing by doing streaming. i don't do any streaming right now.
the upside is that beatsync works in the browser. just a link means no setup is required.
rezonant 2 hours ago [-]
It's not open source until you pick a license. Since there is no license in this repository, it is at best source-available.
Dwedit 5 hours ago [-]
How does it deal with the audio ring buffers on the various devices? Does it just try to start them all at the same time, or does it take into account the sample position within the buffer?
freemanjiang 4 hours ago [-]
Great question! There's two steps:
First, I do clock synchronization with a central server so that all clients can agree on a time reference.
Then, instead of directly manipulating the hardware audio ring buffers (which browsers don't allow), I use the Web Audio API's scheduling system to play audio in the future at a specific start time, on all devices.
So a central server relays messages from clients, telling them when to start and which sample position in the buffer to start from.
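The two steps could be sketched roughly like this. This is not the project's actual code: the NTP-style four-timestamp offset estimate and every name below are assumptions made for illustration. The result is shaped to be passed to the Web Audio call `AudioBufferSourceNode.start(when, offset)`, where both arguments are in seconds:

```typescript
// One ping/pong exchange with the server; all times in milliseconds.
interface PingSample {
  clientSend: number;    // t0, read from the client clock
  serverReceive: number; // t1, read from the server clock
  serverSend: number;    // t2, read from the server clock
  clientReceive: number; // t3, read from the client clock
}

// NTP-style estimate of (server clock - client clock).
// Assumes the network delay is the same in both directions.
function clockOffsetMs(s: PingSample): number {
  return ((s.serverReceive - s.clientSend) + (s.serverSend - s.clientReceive)) / 2;
}

// Convert a server-chosen start time into start(when, offset) arguments.
function scheduleArgs(
  serverStartMs: number,  // when playback should begin, on the server clock
  trackOffsetSec: number, // position in the track to start from
  offsetMs: number,       // result of clockOffsetMs
  clientNowMs: number,    // client wall clock right now
  audioNowSec: number     // AudioContext.currentTime right now
): { when: number; offset: number } {
  const localStartMs = serverStartMs - offsetMs;
  const delaySec = (localStartMs - clientNowMs) / 1000;
  if (delaySec >= 0) {
    return { when: audioNowSec + delaySec, offset: trackOffsetSec };
  }
  // The start time has already passed: play immediately, skipping ahead
  // in the track by however late this client is.
  return { when: audioNowSec, offset: trackOffsetSec - delaySec };
}
```

A client would call something like `source.start(args.when, args.offset)` with the returned values. Handling a late client by skipping ahead is a design choice in this sketch, not a documented behavior of the project.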
camtarn 4 hours ago [-]
Interesting. Feels like this might still have some noticeable tens-of-milliseconds latency on Windows, where the default audio drivers still have high latency. The browser may intend to play the sound at time t, but when it calls Windows's API to play the sound I'm guessing it doesn't apply a negative time offset?
serial_dev 4 hours ago [-]
So it doesn't need to use the microphone? I guess from the "works across the ocean" comment and based on this description. I would have thought you would listen to the mic and sync based on surrounding audio somehow but it's good to know that it's not needed.
freemanjiang 4 hours ago [-]
Yup no microphone. It's all clock sync
cosmotic 5 hours ago [-]
Another issue is seeking in compressed audio. When seeking (to sync), some APIs snap to frame boundaries.
cosmotic 1 hours ago [-]
I solved this by decompressing the whole file into memory as PCM.
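A small sketch of why decoded PCM helps: a seek can land on any individual sample frame, whereas a compressed stream may only be seekable in larger units. The MP3 figure of 1152 samples per frame is used purely as an illustrative assumption; the comment above does not say which format or API was involved, and the function names are hypothetical:

```typescript
// Convert a seek time to a sample-frame index in decoded PCM,
// clamped to the valid range of the buffer.
function seekFrame(timeSec: number, sampleRateHz: number, totalFrames: number): number {
  const frame = Math.round(timeSec * sampleRateHz);
  return Math.min(Math.max(frame, 0), totalFrames);
}

// Duration of one seekable unit in milliseconds.
// PCM: samplesPerUnit = 1. MPEG-1 Layer III (MP3): samplesPerUnit = 1152.
function granularityMs(samplesPerUnit: number, sampleRateHz: number): number {
  return (samplesPerUnit / sampleRateHz) * 1000;
}
```

At 44.1 kHz, `granularityMs(1, 44100)` is about 0.023 ms while `granularityMs(1152, 44100)` is about 26 ms, which is far coarser than the millisecond-level sync being discussed. The cost is memory, since the whole track is held uncompressed.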
brcmthrowaway 5 hours ago [-]
This is my question, does it do interpolation or pitch bending
lacoolj 3 hours ago [-]
Very very cool idea, but this is a bummer: "Optimized for Chrome on macOS. Unstable for other platforms..."
Once that changes (at the very least, the macOS part), I can't wait to play with it!
freemanjiang 2 hours ago [-]
It works on other platforms! Just not as smooth as Chrome.
Groxx 1 hours ago [-]
Impressively accurate - Android phone in Firefox <-> Chrome on OSX == basically perfect to my ear. That's super cool, thanks for sharing!
maxmynter95 2 hours ago [-]
It's a really interesting vibe when you play on multiple machines. Sometimes you can notice a slight off-ness which gives this reverb effect.
hackncheese 4 hours ago [-]
Any plans to integrate this with Apple Music or Spotify? I would assume your algorithm would work only with files uploaded to the site, but curious if you had plans to attempt something with Apple Music/Spotify
freemanjiang 2 hours ago [-]
Yes! The very next step.
bjackman 4 hours ago [-]
Very cool! As someone who doesn't know much about the topic, I'm surprised that "millisecond-level accuracy" is enough. I would have imagined that you need to be accurate down to some fairly small multiple of the sample rate to avoid phasing effects.
Do you have any interesting insight into that question?
cesaref 3 hours ago [-]
If you look at professional distributed audio systems (Dante, AES67 etc) you'll find that they all require PTP support on the hardware to achieve the required timing accuracy, so yes, you need <1ms to get to the point of being considered suitable if you are doing anything which involves, say, mixing multiple streams together and avoiding phasing type effects.
However, it very much depends on what your expectations are, and how critical your listening is. If no one is looking for problems, it can be made to work well enough.
freemanjiang 4 hours ago [-]
Yeah the threshold is pretty brutal, but it is enough. Experimentally, I'd say you need under 2-3ms but even at 1ms you can start to hear some phase differences.
Most of the time, I think my synchronization algorithm is actually sub-1ms, but it can be worse depending on unstable network conditions.
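For a rough sense of scale, the arithmetic below relates a timing offset to phase. It assumes the idealized case of two equal-level copies of the same signal summed at the listener, which real rooms with separated speakers only approximate; the function names are hypothetical:

```typescript
// Lowest frequency that cancels when two equal copies of a signal are
// summed with the given time offset (the first comb-filter notch).
function firstNotchHz(offsetMs: number): number {
  return 1000 / (2 * offsetMs);
}

// Phase difference, in degrees, that a time offset causes at a frequency.
function phaseShiftDegrees(freqHz: number, offsetMs: number): number {
  return 360 * freqHz * (offsetMs / 1000);
}

// Duration of a single sample in microseconds.
function samplePeriodMicros(sampleRateHz: number): number {
  return 1e6 / sampleRateHz;
}
```

Under that assumption a 1 ms offset puts the first notch at 500 Hz (a 180 degree shift), and a 2 ms offset puts it at 250 Hz. One sample at 48 kHz lasts about 20.8 microseconds, so 1 ms is 48 samples of misalignment.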
mkishi 2 hours ago [-]
How are you measuring this? I'm surprised the Web Audio API scheduling system has that much insight into the hardware latency.
hatthew 4 hours ago [-]
Sound travels at a speed of ~1 foot/millisecond
camtarn 2 hours ago [-]
Oh, that's a nice approximation! Similar to Grace Hopper's famous demo of a roughly foot-long (11.8 inch) wire being about how far electrical signals travel in a nanosecond.
jauntywundrkind 5 hours ago [-]
Unfortunately the w3c webtiming community group has closed. It'd be amazing to have the browser better able to keep time in sync across devices.
Luckily the audio industry has solved this problem, and they use PTP as the clocking mechanism for AES67 (kind of the bastard child of Ravenna and Dante, but with a fully open* AoIP protocol) that's designed for handling all the hard parts of sync'ing audio over a network. And it's used everywhere these days, but mostly in venues/stadiums/theme parks.
* open if you pay membership dues to the AES or buy the spec
jauntywundrkind 2 hours ago [-]
Hopefully wifi8 has something like PTP built in. I hear there's some vague hope that better timing info is one of the core pieces, so maybe maybe!
I'm super jazzed seeing AES67 emerge.. although it not working great over wifi for lack of proper timing info hurts. Very understandable for professional gear, but there's nothing I love more than seeing professional, prosumer and consumer gear blend together!
PipeWire already has pretty decent support! There's a tracker where people report on their hardware experiences trying it. Some really really interesting hardware shows up here (and elsewhere on the gitlab): https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/32...
js4ever 2 hours ago [-]
Love it, this is impressive and very smart, no need for mic!
HelloUsername 3 hours ago [-]
Cool! I'd swap the 'search music' (cobalt.tools) button with the 'upload audio' button
RicoElectrico 1 hours ago [-]
Does this resync periodically? (I mean not only when a new track starts)
ajb 3 hours ago [-]
That's cool!
Last I heard safari was buggy and behind on web audio - did you run into any issues there?
crunchwrapjs 5 hours ago [-]
i've been wanting to make this for so long! it's crazy that it's done completely in the browser
badmonster 3 hours ago [-]
how does it achieve millisecond-accurate multi-device audio synchronization across browsers?
Just to share a couple of similar/related projects in case useful for reference:
http://strobe.audio multi-room audio in Elixir
https://www.panaudia.com multi-user spatial audio mixing in Rust
https://www.w3.org/community/webtiming/
https://github.com/webtiming/timingobject