pyaudio

2021. 6. 21. 13:12

How to use Python modules for real-time audio processing

Referenced YouTube link

https://www.youtube.com/watch?v=AShHJdSIxkY

PyAudio provides Python bindings for PortAudio, the cross-platform audio I/O library.

With PyAudio, audio can easily be played and recorded from Python on a variety of platforms such as Linux, Windows, and Mac OS.

http://people.csail.mit.edu/hubert/pyaudio/

PyAudio: PortAudio v19 Python Bindings

CHUNK : number of frames per buffer, i.e. the size of the blocks that the (potentially very long) signal is split into; the value is chosen arbitrarily in this example

rate : sampling rate in Hz

channels : number of channels

format : sample size and format

input : specifies whether this is an input stream

output : specifies whether this is an output stream
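
As a rough illustration of how CHUNK, rate, channels, and format relate, the sketch below computes how much audio one buffer holds and how many bytes it occupies. The helper names are made up for this example, and the values are the ones used later in this post (2048 frames, 44100 Hz, mono, 32-bit float samples).

```python
# Hypothetical helpers; the values mirror the ones used later in this post.
BYTES_PER_SAMPLE = 4  # a 32-bit float sample (paFloat32) occupies 4 bytes


def buffer_duration(chunk, rate):
    """Seconds of audio covered by one buffer of `chunk` frames."""
    return chunk / rate


def buffer_bytes(chunk, channels, bytes_per_sample=BYTES_PER_SAMPLE):
    """Size in bytes of one buffer (frames x channels x sample width)."""
    return chunk * channels * bytes_per_sample


if __name__ == "__main__":
    chunk, rate, channels = 1024 * 2, 44100, 1
    print(f"{buffer_duration(chunk, rate) * 1000:.1f} ms per buffer")
    print(f"{buffer_bytes(chunk, channels)} bytes per buffer")
```

With these values each callback receives about 46 ms of audio, so a larger CHUNK means fewer callbacks but a longer delay before each block can be processed.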

Installing PyAudio together with librosa

Anaconda install

After creating a conda env:

pip install librosa

conda install pyaudio
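
After the two install commands, a quick way to confirm that both packages are visible to the active environment is to look them up without importing them. This is only a small sketch using the standard library; the function name `missing_modules` is made up for this example.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["pyaudio", "librosa"])
    if missing:
        print("Not installed in this environment:", ", ".join(missing))
    else:
        print("pyaudio and librosa are both importable")
```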

https://stackoverflow.com/questions/33513522/when-installing-pyaudio-pip-cannot-find-portaudio-h-in-usr-local-include

when installing pyaudio, pip cannot find portaudio.h in /usr/local/include

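The feature required for this project is to take pungmul/samulnori audio data (from a microphone), separate it into harmonic and percussive components, and use them to control a drone.

A Blue Yeti microphone connected to the computer is used as the input channel; the audio stream is received from it and processed in real time.

Related code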

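import sys

import librosa
import numpy as np
import pyaudio
import soundfile as sf


class AudioHandler(object):

    def __init__(self):
        self.FORMAT = pyaudio.paFloat32  # 32-bit float samples
        self.CHANNELS = 1                # mono input
        self.RATE = 44100                # sampling rate in Hz
        self.CHUNK = 1024 * 2            # frames per buffer
        self.p = None
        self.stream = None
        self.WAVE_OUTPUT_FILENAME = "output.wave"
        self.RECORD_SECONDS = 5
        self.frames = []
        self.data = []
        self.arr = np.array([])          # accumulated percussive samples
        self.recording = True

    def start(self):
        # Open an input-only stream; PyAudio invokes self.callback for each buffer.
        self.p = pyaudio.PyAudio()
        self.stream = self.p.open(format=self.FORMAT,
                                  channels=self.CHANNELS,
                                  rate=self.RATE,
                                  input=True,
                                  output=False,
                                  stream_callback=self.callback,
                                  frames_per_buffer=self.CHUNK)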

To process the audio stream in real time, stream_callback is used: PyAudio calls the callback function below each time a new buffer of frames is available.

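    def callback(self, in_data, frame_count, time_info, flag):
        # Interpret the raw bytes delivered by PyAudio as float32 samples.
        numpy_array = np.frombuffer(in_data, dtype=np.float32)
        # Split the chunk into harmonic and percussive components.
        y_harmonic, y_percussive = librosa.effects.hpss(numpy_array)
        self.arr = np.append(self.arr, y_percussive)
        print(sys.getsizeof(self.arr))
        # A stream callback must return (out_data, flag); out_data is unused for input-only streams.
        return in_data, pyaudio.paContinue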

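Inside this callback, the librosa library is used to separate out the percussive part of the audio data.

librosa.effects.hpss takes a numpy array as input and returns numpy arrays as output (harmonic / percussive).

np.append is the append operation for numpy arrays; it returns a new array rather than modifying the existing one, which is why the result is assigned back to self.arr.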
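
Because np.append copies everything accumulated so far on every call, the cost of each callback grows as the recording gets longer. One possible alternative, shown here only as a sketch, is to keep the per-callback chunks in a Python list and concatenate them once when the full array is needed. The class name `ChunkCollector` is made up for this example, and synthetic bytes stand in for microphone data; in the code above the percussive component would be stored instead of the raw chunk.

```python
import numpy as np


class ChunkCollector:
    """Hypothetical helper: stores per-callback chunks and joins them on demand."""

    def __init__(self):
        self.chunks = []

    def add(self, in_data):
        # in_data is raw bytes, in the form a PyAudio callback receives it.
        self.chunks.append(np.frombuffer(in_data, dtype=np.float32))

    def as_array(self):
        # One concatenation at the end instead of one full copy per chunk.
        if not self.chunks:
            return np.array([], dtype=np.float32)
        return np.concatenate(self.chunks)


if __name__ == "__main__":
    collector = ChunkCollector()
    for i in range(3):
        fake_chunk = np.full(2048, i, dtype=np.float32).tobytes()
        collector.add(fake_chunk)
    print(collector.as_array().shape)  # (6144,)
```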

For now, it is implemented so that when the main loop receives a specific key input, the appended numpy array is written out as a wav file.

The library used for this is soundfile:

sf.write('output_percussive.wav',self.arr,self.RATE)

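--> Planned next step: split the audio stream arriving from the microphone in real time into time windows, and add a function that moves the drone according to the average audio intensity of each window.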
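
The planned windowing step could look roughly like the sketch below. Everything in it is an assumption made for illustration: "average intensity" is taken to be the RMS level of each non-overlapping window, the threshold value is arbitrary, and the command strings are placeholders rather than a real drone API.

```python
import numpy as np


def window_rms(samples, window_size):
    """RMS level of each full, non-overlapping window of `window_size` samples."""
    n_windows = len(samples) // window_size
    trimmed = np.asarray(samples[: n_windows * window_size], dtype=np.float64)
    windows = trimmed.reshape(n_windows, window_size)
    return np.sqrt(np.mean(windows ** 2, axis=1))


def to_command(rms, threshold=0.1):
    """Map a window level to a placeholder command string (not a real drone API)."""
    return "move" if rms >= threshold else "hover"


if __name__ == "__main__":
    rate = 44100
    quiet = np.zeros(rate, dtype=np.float32)
    loud = 0.5 * np.ones(rate, dtype=np.float32)
    signal = np.concatenate([quiet, loud])
    levels = window_rms(signal, window_size=rate)  # one-second windows
    print([to_command(level) for level in levels])  # ['hover', 'move']
```

Samples left over after the last full window are dropped here; a streaming version would carry them over to the next callback.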

Added 2021.06.30.

References on real-time audio segmentation

https://github.com/tyiannak/pyAudioAnalysis

tyiannak/pyAudioAnalysis (Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications)

Reference blogs

https://hyongdoc.tistory.com/400?category=884319

[Analyzing voice data with Python] Sound and the form of data

https://dacon.io/competitions/official/235616/codeshare/1305?page=1&dtype=recent

Overlapping speech data classification AI competition (source: DACON - Data Science Competition)

https://sanghyu.tistory.com/45

Python implementation and meaning of MFCC (Mel Frequency Cepstrum Coefficient)

사물놀이의 사운드 분석을 통한 시공간적 표현에 관한 연구 (English title as listed: "An Study on Visual Expression by Analysis of Analysis of Samulnori Sound"), www.kci.go.kr

https://www.dbpia.co.kr/Journal/articleDetail?nodeId=NODE00536716

[Special feature] Frequency bands of samulnori instrument sounds and the human voice

Matplotlib for displaying the spectrum

https://stackoverflow.com/questions/19181165/matplotlib-figure-as-a-new-class-attribute-how-to-show-it-later-on-command

Matplotlib figure as a new class attribute. How to show it later, on command?

https://gist.github.com/sshh12/62c740b329229c7292f2a7b520b0b6f3

Live mic -> live melspectrogram plot

Blogs explaining MFCC

* https://brightwon.tistory.com/11

Understanding MFCC (Mel-Frequency Cepstral Coefficient)

http://keunwoochoi.blogspot.com/2016/03/2.html

A beginner's guide to speech/music signals + machine learning [Part 2]

https://hyunlee103.tistory.com/45

Audio data preprocessing (3) Cepstrum Analysis
