PhonicMind - an online service that can extract / remove vocals from sound sources

This article introduces PhonicMind, an online service that extracts or removes vocals from an audio track, along with similar software and services.

What is PhonicMind?

PhonicMind is an online service that automatically extracts and removes vocals from 2-mix (stereo mixdown) audio.

After you upload a track, you can download two files: one containing only the extracted vocals and one with only the vocals removed.

PhonicMind official website

PhonicMind's reputation

Judging from the reactions in the links below, PhonicMind appears to have a good reputation.

PhonicMind, a vocal remover that actually works when it comes to isolating the vocals. HIGHLY RECOMMENDED! from makingvaporwave

Has anyone tried PhonicMind? from IsolatedVocals

How PhonicMind works

PhonicMind appears to use a deep neural network.

PhonicMind’s vocal remover uses deep neural networks to do vocal elimination.


Other vocal removal / extraction software

Vocal Remover (

Like PhonicMind, VocalRemover is an online service that can automatically remove / extract vocals from 2-mix sound sources.

Choose "Vocal Remover" in the left menu to remove vocals, or "Vocal Extractor" to extract them.

Vocal Remover

When I tried it, sounds in frequency bands close to the vocal and sounds panned to the center were extracted together with the vocal. The vocal-removed track also still contained reverb from the vocal.

My guess is that it uses the traditional approach of extracting vocals based on stereo localization, frequency band, and transient information.
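As a rough illustration of the localization part of that traditional approach (not this service's actual algorithm, which is not documented here), subtracting one stereo channel from the other cancels anything panned dead center, which is where lead vocals usually sit. The function names and the toy signals below are assumptions made for this sketch.

```python
import math

def remove_center(left, right):
    """Side signal (L - R) / 2: cancels content that is identical in both channels."""
    return [(l - r) / 2.0 for l, r in zip(left, right)]

def extract_center(left, right):
    """Mid signal (L + R) / 2: keeps center-panned content at full level."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]

# Toy stereo mix (hypothetical data): a "vocal" panned center
# and a "guitar" panned hard left.
n = 8
vocal = [math.sin(2 * math.pi * 0.10 * i) for i in range(n)]
guitar = [math.sin(2 * math.pi * 0.23 * i) for i in range(n)]
left = [v + g for v, g in zip(vocal, guitar)]
right = list(vocal)  # the guitar is absent from the right channel

karaoke = remove_center(left, right)  # vocal cancelled, guitar at half level
center = extract_center(left, right)  # vocal plus half of the guitar
```

The mid signal still contains part of the guitar, and any other center-panned instrument would survive in it untouched, which is consistent with the leakage described above.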

VocalRemover (

Like PhonicMind, VocalRemover is an online service that can automatically remove / extract vocals from 2-mix sound sources.


When I tried it, the quality was higher, and less reverb remained after vocal removal. The quality felt on par with PhonicMind.

Lakeside Audio Isola Pro FX

Lakeside Audio Isola Pro FX is a VST plug-in that can extract various instruments semi-automatically from 2-mix sound sources.

Lakeside Audio Isola Pro FX

It offers a "MIDI mode", in which you give hints via MIDI, and an automatic mode, in which you specify the frequency band; "MIDI mode" appears to give higher quality. Because it is a VST plug-in, it can process audio in real time.

In a comparison video of PhonicMind and Lakeside Audio Isola Pro FX, a few artifacts were audible, but the quality felt comparable to PhonicMind.

iZotope RX 7

iZotope RX 7 is standalone software for repairing and adjusting 2-mix audio, aimed at both music production and post-production.

iZotope RX 7

The first version of iZotope RX was announced in 2007. RX 7, released in 2018, added a function that automatically extracts vocals, bass, percussion, and other elements from a 2-mix source so that their volumes can be rebalanced.

According to the following statement, a neural network is used for the source separation algorithm.

The evolution of our intelligent audio technology continues with the Music Rebalance module in RX 7. Music Rebalance is a new tool that gives users the ability to boost, attenuate, or even isolate musical elements from audio recordings. It is a natural progression of our neural network-based source separation technology, first introduced in the forms of Dialogue Isolate and De-rustle in RX 6 and now evolved to extract multiple musical components from complex mixes.


Audionamix XTRAX STEMS

Audionamix XTRAX STEMS is standalone software that fully separates 2-mix audio into three stems: vocals, drums, and other instruments.

Audionamix XTRAX STEMS

According to this information, it appears to use a neural network and to outperform ADX TRAX.

Audionamix ADX TRAX

Audionamix ADX TRAX is standalone software specialized in vocal extraction. Unlike PhonicMind, it lets you fine-tune the result manually while viewing the spectrum.

Audionamix ADX TRAX


BlueLab REBALANCE

BlueLab REBALANCE is a VST plug-in that splits a 2-mix source into four parts (vocals, bass, drums, and other instruments) and lets you adjust the volume of each. It was released in January 2019. Because it is a VST plug-in, it can process audio in real time.

When I tried it, the quality was lower than PhonicMind's. My guess is that it uses a traditional algorithm.


Which vocal removal / extraction service should be used?

I want to make a karaoke track

PhonicMind or VocalRemover is a good choice.

Because they are web services, you avoid the trouble of installing software.

I want to transcribe music by ear

Lakeside Audio Isola Pro FX, iZotope RX 7, or Audionamix XTRAX STEMS is a good choice.

These can also extract instruments other than vocals. I have not used them myself, so I cannot say which is best.


This article introduced software and services for extracting and removing vocals, starting with PhonicMind.

Share of DAW

DAW Share

I looked into DAW market share. Globally, Ableton Live and Logic Pro are popular, while in Japan Cubase and Studio One turned out to be popular.

Share of DAW

DAW Share
This chart summarizes DAW share. It is based on the results of one survey conducted in 2015 and 2018 and another conducted in 2016 and 2017.

DAW's Twitter followers

Twitter followers of DAW developers

I compiled the follower counts of each DAW's Twitter account.

If a DAW has its own Twitter account, I plotted that account's follower count. If it has none but its developer has a company account, I plotted the company account's follower count. If neither exists, I plotted 0. Note that company accounts tend to have more followers than DAW-specific accounts, so a high follower count does not necessarily mean the DAW is popular.
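The plotting rule above can be written as a small fallback function. This is only a sketch of the rule as described; the function name and the example numbers are hypothetical and are not data from the chart.

```python
from typing import Optional

def plotted_followers(daw_followers: Optional[int],
                      company_followers: Optional[int]) -> int:
    """Pick the value to plot: DAW account first, then company account, else 0."""
    if daw_followers is not None:
        return daw_followers
    if company_followers is not None:
        return company_followers
    return 0

# Hypothetical examples (None means the account does not exist).
with_daw_account = plotted_followers(12_000, 80_000)   # uses the DAW account
company_only = plotted_followers(None, 80_000)         # falls back to the company
no_account = plotted_followers(None, None)             # plotted as 0
```

Because the second case silently substitutes a company-wide number, values produced by the fallback are not directly comparable with values from DAW-specific accounts.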

DAW release year

Initial Release Date of DAW

I summarized the year in which each DAW was first released.

Many companies built a MIDI sequencer as the predecessor of their DAW; in those cases, the MIDI sequencer's release year is used as the release year. The maker of Digital Performer released the notation software "Professional Composer" in 1984 as a predecessor of its MIDI sequencer, but that is not counted as the initial release.

Discussion of DAW share

Cubase and Studio One are popular in Japan

According to the survey results, Cubase and Studio One are popular in Japan. My gut feeling was that Cubase and Studio One were the more popular DAWs in Japan, and the results matched that expectation.

Ableton Live is popular globally

According to the survey results, Ableton Live is popular globally. This did not match my intuition at first, but judging from Twitter follower counts and SimilarWeb data, Ableton Live does indeed appear to be popular. ※ The SimilarWeb data is not published here.

Why do the popular DAWs differ between Japan and the rest of the world?

Whether or not a vendor focuses on marketing in Japan may account for the difference between global and Japanese DAW share. Differences in which kinds of music are popular may be another cause.

Reason for Ableton Live being the most popular globally

I tried using Ableton Live

I tried Ableton Live myself to explore why it has become popular globally.

I found no big difference compared with Cubase, Studio One, or FL Studio. The way patterns are arranged resembles FL Studio, while the piano roll and mixer resemble Cubase and Studio One. The differences may only become apparent after long-term use.

Click here for Ableton Live trial version

Ableton Live User's Voice

Ableton Live is good at audio

Not the best when it comes to dealing with actual audio clips, as opposed to MIDI. Still very capable, but a little more scattered and convoluted than Ableton and Logic.

Honestly? I use FL because I torrented a copy in high school, and got used to the workflow. I did eventually buy a copy because I’m not a douche. I use ableton as well now, because it’s much better for working with audio clips and live performance.


Ableton Live seems to be better at handling audio than FL Studio.

Ableton Live takes time to master

Ableton is a bit more expensive and might be more harder to learn, FL is less expensive and it’s lifetime + kind if beginner friendly at first.


Haha that’s the normal first reaction to opening Ableton — “wtf is this”


Ableton Live seems to take time to learn.

Ableton Live takes less time to master

I really dislike the fact that people call FL beginner friendly compared to Ableton. Having used both for years, Ableton is so much easier to get into. Could be just me tho.


There is also the opinion that Ableton Live is actually easy to use.

Ableton Live is suitable for hip-hop production

Now that being said….Ableton….w/Push 2 is the goat DAW if you are into sample based production…specifically chopping samples…real hip hoppy stuff. My beef with Ableton is its archaic windows 98 looking interface.


Ableton Live combined with Ableton Push 2 seems well suited to hip-hop production. Ableton Live may be popular partly because hip-hop is currently trending in the United States.

Ableton Live users are friendly

Ableton user forums are generally quite helpful and the community will treat a newcomer with some respect. You better put on your bulletproof vest when using the FL Studio user forums because I’ve never seen that many arrogant pricks in one place in all my life! They’d rather ridicule you than answer your simple question (even the Image-Line employees!). Thankfully I know the software pretty well now so I avoid their forums at all costs.


Ableton Live users seem to be more friendly than FL Studio users.

Ableton Live encourages artist growth

What makes Ableton makes so good is number 1 its relationships with simple and complex ways of doing things, it has a germen quality about it t, in other words, the UI, the look and feel is very high, enabling high flow ie its easy to “waste” hours playing with sounds, there are crucial in becoming a better producer, a weekend playing and making sounds rather than “songs” is more valuable in learning, than a 6 month class (or do both)h.


One opinion is that because Ableton Live naturally invites long sessions, it makes it easier to grow as an artist.

DAW used by top artists

If artists whom people admire use Ableton Live, that alone may make Ableton Live popular.

Marshmello: Ableton Live, FL Studio, Logic Pro

According to this information Marshmello may be using Ableton Live, FL Studio, Logic Pro.

Travis Scott: FL Studio

According to this information Travis Scott may be using FL Studio.

Other artists

According to this information, Calvin Harris, David Guetta, Armin van Buuren, Skrillex, Zedd seem to be using Pro Tools, FL Studio, Ableton Live, Logic Pro, Cubase, Studio One.


DAW questionnaire

Questionnaire results

DAW's Twitter

Ableton Twitter:
Pro Tools Twitter:
Steinberg Twitter:
FL Studio Twitter:
PreSonus Twitter:
Propellerhead Software Twitter:
Cockos Twitter:
Cakewalk Inc. Twitter:
Bitwig Twitter:
MOTU Twitter:

DAW release year

Ableton Live Initial Release Date:
Logic Pro Initial Release Date:
Pro Tools Initial Release Date:
Cubase Initial Release Date:
FL Studio Initial Release Date:
Studio One Initial Release Date:
Reason Initial Release Date:
Reaper Initial Release Date:
Sonar Initial Release Date:
Garage Band Initial Release Date:
Bitwig Studio Initial Release Date:
Digital Performer Initial Release Date:


I examined the share of DAW.

I found that the popular DAWs differ between Japan and the rest of the world. Globally, Ableton Live and Logic Pro are popular, while in Japan Cubase and Studio One are popular. There are many possible reasons, but I could not pin down a clear one.

List of Publications


2018/12/20 "MusicTech"

MinusDelay was included on the DVD.

"MusicTech Magazine Issue 190: Gear Of The Year 2018"

2018/12/11 "FINDERS"

The smartphone version of AI Mastering was introduced.

"Automatic audio mastering app "Sound Pressure Blow-up-kun", dedicated to all "smartphone video creators""

2018/11/11 ""

MinusDelay was introduced.

"Best free plug-ins this week: Snare Designer, Minus Delay & DeEss"

2018/11/06 "Computer Music Japan"

ClearMixer was introduced.

"BAKUAGE releases "ClearMixer", a plug-in that automatically reduces frequency-band overlap with one touch"

2018/10/28 "Computer Music Japan"

MinusDelay was introduced.

"[Free] Bakuage starts free distribution of "MinusDelay", a plug-in that makes notes sound earlier!"

ClearMixer v1.2.0 has been released. Supports 32 tracks


ClearMixer v1.2.0 has been released. The number of tracks has been increased from 16 to 32.

What is ClearMixer?

ClearMixer is a VST mixer plug-in that automatically reduces masking between instruments. It is ideal for anyone who wants an easy way to create a mix with less interference between tracks.

Download the latest version

Demo Version

Product version

Buy product version

* To update, just run install.bat.

* Please see the bundled README for usage.

Update content (v1.2.0)

The number of tracks has been increased from 16 to 32.

32 track version (latest version)

ClearMixer v1.2.0 32 Track ClearMixerSender v1.2.0 32 Track

16 track version (old version)

ClearMixer 16 Track

Added loudness optimization option for YouTube

Estimated Weighting Curve Used for YouTube Loudness Normalization

In line with YouTube's loudness normalization specification, an option to optimize sound pressure has been added to AI Mastering's custom mastering.

"Target sound pressure mode" option

Specifies how the target sound pressure is measured.

target loudness mode option

Loudness (conventional operation)

Limits the signal so that the loudness defined by ITU-R BS.1770 matches the target sound pressure. This is the same behavior as before.

YouTube Loudness

Limits the signal so that an approximation of the loudness used in YouTube's loudness normalization matches the target sound pressure. The approximation is based on the survey of YouTube's loudness normalization algorithm.

According to the survey, the reference value for loudness normalization is -10.3 dB. Because the actual sound pressure ends up slightly lower than the target, setting the target sound pressure to about -9 dB puts the result right at the borderline where loudness normalization only just does or does not take effect.

How to optimize loudness for YouTube?

Combined with the "Ceiling" option added here, the settings below give you mastering optimized for YouTube.

"Target sound pressure mode": "YouTube loudness"

"Target sound pressure": "-9 dB"

"Ceiling mode": "True Peak (15 kHz Lowpass)"

"Ceiling": "-0.5 dB"

※ The YouTube loudness normalization reference value is -10.3 dB; the target sound pressure is set to -9 dB because the actual sound pressure ends up slightly below the target.

How do I get the previous behavior?

With the following setting, it behaves as before.

"Target sound pressure mode": "Loudness"

"Ceiling" option added to "AI Mastering"

Since a recent update, AI Mastering has limited on the basis of true peak (inter-sample peak), but depending on the application it is sometimes preferable to limit on the basis of ordinary peaks.

For that purpose, I added a "Ceiling" option to AI Mastering's "Custom Mastering".

"Ceiling" option

Specifies the maximum output level, like the Ceiling control of a typical limiter VST plug-in. The maximum is 0 dBFS. If the audio will be encoded later, setting a slightly lower value can prevent clipping caused by encoding.

"Ceiling mode" option

Specifies how the maximum output level is measured. It corresponds to the oversampling option or True Peak (inter-sample peak) option of a typical limiter VST plug-in.


Peak

Peak is the ordinary sample peak: the maximum absolute amplitude of the discrete waveform.

True Peak

True Peak is the inter-sample peak: the maximum absolute amplitude after the discrete waveform is converted into a continuous waveform.

True Peak (15 kHz Lowpass)

True Peak (15 kHz Lowpass) is based on the inter-sample peak after low-pass filtering at 15 kHz. It simulates the peak change caused by YouTube's re-encoding, so it is best suited to audio destined for YouTube.
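The difference between the first two modes can be seen on a test tone whose crests fall between samples. The sketch below is not AI Mastering's implementation; it estimates the inter-sample peak with a simple Hann-windowed sinc interpolator, and the function names, the 4x oversampling factor, and the filter half-width are all assumptions chosen for illustration. The 15 kHz low-pass variant is not covered.

```python
import math

def sample_peak(x):
    """Ordinary peak: largest absolute sample value."""
    return max(abs(v) for v in x)

def true_peak(x, oversample=4, half_width=16):
    """Estimate the inter-sample peak by Hann-windowed sinc interpolation."""
    peak = 0.0
    n = len(x)
    for i in range(n):
        lo = max(0, i - half_width + 1)
        hi = min(n, i + half_width + 1)
        for k in range(oversample):
            t = i + k / oversample  # fractional sample position
            acc = 0.0
            for j in range(lo, hi):
                d = t - j
                if d == 0.0:
                    weight = 1.0
                else:
                    sinc = math.sin(math.pi * d) / (math.pi * d)
                    hann = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
                    weight = sinc * hann
                acc += x[j] * weight
            peak = max(peak, abs(acc))
    return peak

def to_dbfs(level):
    """Convert a linear level (1.0 = full scale) to dBFS."""
    return 20.0 * math.log10(level)

# A sine at a quarter of the sample rate, phased so that every sample
# lands at about 0.707 while the continuous waveform reaches 1.0.
x = [math.sin(math.pi * n / 2 + math.pi / 4) for n in range(64)]
peak_db = to_dbfs(sample_peak(x))        # about -3.01 dBFS
true_peak_db = to_dbfs(true_peak(x))     # roughly 3 dB higher, near 0 dBFS
```

A limiter that only watches sample peaks would treat this tone as having about 3 dB of headroom that the reconstructed waveform does not actually have, which is why a true-peak ceiling is the safer choice before re-encoding.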

How do I get the previous behavior?

With the following settings, it behaves as before.

"Ceiling": 0 dBFS

"Ceiling mode": "Peak"

AI Mastering has been updated

AI Mastering has been updated.

Removed the one-touch mastering settings

To make AI Mastering easier to use, I removed the settings from one-touch mastering. Please use custom mastering if you want the bass-preservation option or the video-title option.

Target sound pressure of one touch mastering has been lowered

We reduced the target sound pressure of one touch mastering to -9 dB so that you can master safely in a wide range of applications such as YouTube videos.

If you want higher sound pressure, please use custom mastering.

We deleted less frequently used functions

The "Depth plus" and "Lift up" functions have been removed. Please let me know if you need them restored.

Integrated preset mastering into custom mastering

To simplify the UI, we integrated preset mastering into custom mastering.

Added "YouTube Loudness Correction" indicator

The "YouTube Loudness Correction" indicator is an estimate of how much the loudness will be corrected by loudness normalization when the audio is uploaded to YouTube.

When uploading to YouTube, it is best to keep this value from becoming too small.

It is calculated based on the results of this survey.

Added "True peak" indicator

The "True Peak" indicator is the inter-sample peak.

The "True peak (15 kHz low pass)" indicator is the inter-sample peak of the waveform after a 15 kHz low-pass filter is applied.

On some video platforms such as YouTube, uploaded videos are re-encoded, which changes the waveform. The biggest factor is the low-pass filter applied during re-encoding.

According to this source, the cutoff frequency of the low-pass filter applied in YouTube's re-encoding is 15.1 kHz, 15.8 kHz, 18 kHz, or 20 kHz.

If the "True peak (15 kHz low pass)" index is less than 0 dBFS, there is less chance of clipping when uploading to YouTube.

Added headroom at the peak

Until now, the peak of the mastered audio was adjusted to 0 dBFS, but that makes clipping likely when the audio is re-encoded.

Audio is more convenient to work with when you need not worry about clipping, so I left a little headroom at the peak. Specifically, "True peak (15 kHz low pass)" is now kept below -0.5 dBFS.

What is Shazam? - Apps that can search for songs

This article introduces Shazam, an app that can identify songs.

What is Shazam?

Shazam is an app that identifies songs from sound picked up by the microphone.

Install Shazam

You can install from the link below.


How to use Shazam

When you start Shazam, you will see the following screen.


Tap the button in the middle to display the screen below; Shazam then listens through the microphone and searches for the song.

Shazam Detect

Wait a while and the song will be identified.

This is the basic usage. Alternatively, you can tap My Shazam to see the history of songs you searched in the past.

How Shazam works

According to the following sources, Shazam appears to be built on a technique called acoustic fingerprinting.

Robust Landmark-Based Audio Fingerprinting

Acoustic fingerprint - Wikipedia
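To give a feel for landmark-based fingerprinting, here is a heavily simplified sketch. It is not Shazam's implementation: it assumes spectrogram peaks have already been found and are given as (time frame, frequency bin) pairs, and the function names, parameters, and peak lists are all hypothetical. Each peak is paired with a few later peaks to form a hash, and a query is matched by voting for the track whose hashes line up at one consistent time offset.

```python
from collections import Counter, defaultdict

def landmarks(peaks, fan_out=3, max_dt=10):
    """Pair each peak with up to fan_out later peaks; yield ((f1, f2, dt), t1)."""
    peaks = sorted(peaks)
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break
            if dt > 0:
                yield (f1, f2, dt), t1
                paired += 1
                if paired == fan_out:
                    break

def build_index(tracks):
    """Map every landmark hash to the tracks and anchor times where it occurs."""
    index = defaultdict(list)
    for name, peaks in tracks.items():
        for key, t1 in landmarks(peaks):
            index[key].append((name, t1))
    return index

def identify(index, query_peaks):
    """Return the track with the most hashes agreeing on one time offset."""
    votes = Counter()
    for key, tq in landmarks(query_peaks):
        for name, t1 in index.get(key, []):
            votes[(name, t1 - tq)] += 1
    if not votes:
        return None
    (name, _offset), _count = votes.most_common(1)[0]
    return name

# Hypothetical peak lists for two tracks.
tracks = {
    "song_a": [(0, 40), (2, 55), (5, 32), (7, 61), (9, 47), (12, 38)],
    "song_b": [(0, 20), (3, 70), (4, 25), (8, 66), (10, 30), (13, 72)],
}
index = build_index(tracks)

# An excerpt of song_a that starts 5 frames into the track.
query = [(0, 32), (2, 61), (4, 47), (7, 38)]
result = identify(index, query)
```

Because the hash encodes only the two frequencies and the time difference between them, the excerpt matches regardless of where in the track it was recorded; the shared offset is what separates a real match from coincidental hash collisions.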


This article introduced Shazam, an app that can identify songs.

We analyzed the loudness of YouTube videos in Japan

Japanese YouTube Loudness Histogram

We analyzed the sound pressure (loudness) of Japanese YouTube videos.

YouTube applies loudness normalization. It lowers the volume of videos whose sound pressure is too high but does not raise the volume of videos whose sound pressure is too low.

If many videos on YouTube have too little sound pressure, their playback volume on YouTube could be increased by raising the sound pressure as far as possible without impairing sound quality.


How many videos on YouTube have too little sound pressure?

As mentioned above.

Does the sound pressure drop if the YouTube video length is long?

The longer a video is, the more likely it is that the waveform happens to contain a large peak. A limiter can suppress such peaks, but without a limiter the sound pressure may drop as the length increases.

Examining the relationship between length and loudness may therefore offer a clue as to whether a limiter is being used. Other factors also link length and loudness, so this alone cannot settle the question, but for now we examine the relationship between length and sound pressure.
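The first step of that argument (longer audio is more likely to contain a large chance peak, so audio that is peak-normalized without a limiter ends up quieter) can be checked with a small simulation. This is only a sketch under strong assumptions: Gaussian noise stands in for real audio, plain RMS stands in for loudness, and the function names, lengths, trial count, and seed are arbitrary choices.

```python
import math
import random

def peak_normalized_rms_db(num_samples, rng):
    """RMS level in dB of Gaussian noise after scaling its largest peak to 1.0."""
    x = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    peak = max(abs(v) for v in x)
    rms = math.sqrt(sum(v * v for v in x) / num_samples)
    return 20.0 * math.log10(rms / peak)

def average_level_db(num_samples, trials=5, seed=0):
    """Average the peak-normalized level over several random signals."""
    rng = random.Random(seed)
    total = sum(peak_normalized_rms_db(num_samples, rng) for _ in range(trials))
    return total / trials

short_db = average_level_db(1_000)    # short signal
long_db = average_level_db(50_000)    # 50 times longer
drop_db = short_db - long_db          # how much quieter the long signal is
```

In this toy model the longer signal comes out a couple of decibels quieter purely because its largest peak is more extreme. Real programme material with a limiter applied would not be expected to show this trend.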

Videos analyzed

A. Japan's Top YouTuber

Almost all videos of "Hajime" channel

B. Japanese music channels

Almost all videos of "Lantis Channel" channel

C. Japanese TV station channel

Almost all videos of "AbemaTV Official YouTube" channel

* Videos before 2018/12/8

※ "Almost all" because analysis failed for some videos

※ The video list is given in the Appendix

Metrics analyzed

Loudness

An indicator of sound pressure, calculated according to ITU-R BS.1770-3.

Loudness Range

An indicator of dynamic range. It was calculated following EBU TECH 3342, but with a modified window length and overlap length: window length 0.4 s, overlap length 0.3 s.
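As a rough sketch of how a loudness range of this kind can be computed with a 0.4 s window and 0.3 s overlap: the code below measures the level of each overlapping block, applies the absolute (-70) and relative (-20) gates described in EBU TECH 3342, and takes the spread between the 10th and 95th percentiles. It is not the script used for this analysis, and it deliberately omits the K-weighting filter of ITU-R BS.1770, so its values are unweighted approximations. The function names and the test signal are assumptions.

```python
import math

def block_levels_db(samples, rate, window_s=0.4, overlap_s=0.3):
    """Level of each overlapping block in dB (no K-weighting: a simplification)."""
    win = int(round(window_s * rate))
    hop = int(round((window_s - overlap_s) * rate))
    levels = []
    for start in range(0, len(samples) - win + 1, hop):
        block = samples[start:start + win]
        power = sum(v * v for v in block) / win
        if power > 0.0:
            levels.append(-0.691 + 10.0 * math.log10(power))
    return levels

def loudness_range(levels_db):
    """Spread between the 10th and 95th percentiles of the gated block levels."""
    gated = [l for l in levels_db if l > -70.0]  # absolute gate
    if not gated:
        return 0.0
    mean_power = sum(10.0 ** (l / 10.0) for l in gated) / len(gated)
    threshold = 10.0 * math.log10(mean_power) - 20.0  # relative gate
    gated = sorted(l for l in gated if l > threshold)
    low = gated[int(round(0.10 * (len(gated) - 1)))]
    high = gated[int(round(0.95 * (len(gated) - 1)))]
    return high - low

# Hypothetical test signal: 2 s of a loud tone followed by 2 s of the
# same tone 20 dB quieter, at a deliberately low sample rate.
rate = 1000
tone = [math.sin(2 * math.pi * 50 * n / rate) for n in range(4 * rate)]
signal = [(0.5 if n < 2 * rate else 0.05) * v for n, v in enumerate(tone)]

lra = loudness_range(block_levels_db(signal, rate))
```

For this signal the result is about 20, the level difference between the two halves, while a signal at constant level gives 0.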

Analysis Result


Loudness histogram

Japanese YouTube Loudness Histogram

Loudness cumulative density distribution

Japanese YouTube Loudness CDF

Loudness time series

Japanese YouTube Loudness Time series

Loudness Range

Loudness range histogram

Japanese YouTube Loudness Range Histogram

Loudness range cumulative density distribution

Japanese YouTube Loudness Range CDF

Loudness range time series

Japanese YouTube Loudness Range Time Series

Relationship between loudness and loudness range

Loudness vs Loudness range scatter plot

YouTube Loudness vs Loudness Range

Loudness vs Loudness range average and standard deviation

Japanese YouTube Loudness vs Loudness Range error bar


Length Histogram

Japanese YouTube Length Histogram

Length cumulative density distribution

Japanese YouTube Length CDF

Length time series

Japanese YouTube Length Time series

Relationship between length and loudness

Length vs Loudness Scatter plot

Japanese YouTube Length vs Loudness

Length vs Loudness average and standard deviation

Japanese YouTube Length vs Loudness error bar


How many videos on YouTube have too little sound pressure?

Looking at "Loudness cumulative density distribution", except for "Lantis", 90% or more of the videos have a loudness of -14 dB or less.

Because the loudness formula used here differs from YouTube's, I cannot show the exact threshold above which loudness normalization is applied. Judging from this reference, however, -14 dB is sufficiently low, so for the channels other than Lantis, raising the sound pressure would likely raise the playback volume on YouTube.

Does the sound pressure drop if the YouTube video length is long?

According to "Length vs Loudness average and standard deviation", this does not appear to be the case.


YouTube video analysis result in Japan (tsv)


I analyzed the sound pressure of Japanese YouTube videos.

"Haruan" is a Japanese YouTuber who is fully compliant with the loudness normalization!

"Haruan" may be a YouTuber who is fully compliant with loudness normalization.

What is a YouTuber fully compliant with loudness normalization?

It is a YouTuber who understands YouTube's loudness normalization specification and makes good use of it.

What is Loudness Normalization?

YouTube automatically adjusts the volume between videos.

Have you ever opened "detailed statistics" from the right-click menu of a YouTube video?

youtube video stats

Look at "Content loudness" there. YouTube uses this value as the reference when it performs loudness normalization.

If "Content loudness" is positive, YouTube lowers the volume. If it is negative, the volume remains unchanged.
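The rule just described can be written as a one-line model. This is a sketch of the behavior as stated in this article, not YouTube's actual code; the function name and the example values are hypothetical.

```python
def playback_gain_db(content_loudness_db: float) -> float:
    """Gain applied on playback: turn down positive values, leave the rest alone."""
    return min(0.0, -content_loudness_db)

# Hypothetical "Content loudness" readings in dB.
too_loud = playback_gain_db(3.2)      # turned down by 3.2 dB
on_target = playback_gain_db(0.0)     # left unchanged
too_quiet = playback_gain_db(-8.5)    # also left unchanged, so it plays quieter
```

Under this model, a video mastered 3.2 dB above the reference and one mastered exactly at it play back at the same loudness, while the quiet video stays 8.5 dB below both.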

Two facts about sound

To make good use of loudness normalization, the following two facts are important.

A. There is a trade-off relationship between sound pressure and sound quality

B. Larger volume sounds better

How to make good use of loudness normalization

Because loudness normalization is in effect on YouTube, there is a point beyond which the playback volume no longer rises even if the sound pressure is raised: the point where "Content loudness" reaches 0.

Sound pressure and sound quality trade off against each other. If you raise the sound pressure until "Content loudness" exceeds 0, the playback volume does not rise, so you gain none of the benefit of louder playback sounding better; only the sound quality goes down.

So the best approach when uploading a video to YouTube is to bring "Content loudness" close to 0 without going above it.

"Content loudness" of Haruan's videos

Take a look at the "Content loudness" of Haruan's recent videos. Every one of them appears to be near 0 dB.

Video of 2018/12/02 ("Content loudness" 0.0 dB)

Video of 2018/09/23 ("Content loudness" 0.0 dB)

In other words, Haruan's videos strike the best balance between sound pressure and sound quality on YouTube. This is unlikely to happen by accident, so it may be intentional.

In somewhat older videos, however, "Content loudness" deviates from 0 dB.

Video of 2018/05/25 ("Content loudness" -1.7 dB)

Video of 2017/11/12 ("Content loudness" -8.5 dB)

The channel may have started accommodating loudness normalization only recently.


This article suggested that the YouTuber "Haruan" may be fully compliant with loudness normalization.

For YouTuber-style and news-style content, YouTube's loudness normalization threshold is low enough to reach, so accommodating it may become as routine as SEO.