Implementing Convolution 2D with numpy


eremo2002 2021. 9. 14. 02:02

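This post collects the pieces needed to build a pipeline that runs a deep learning model, all the way through backpropagation and weight updates, without using deep learning libraries such as pytorch or tensorflow.

Here we implement a function that performs the Convolution 2D operation using python numpy.

The Conv2D layers in pytorch and tensorflow take a large number of arguments; for now we use only kernel, padding, and stride.

We create a numpy array of shape (3, 5, 5) to use as input (channel-first format).

output is the return value of the Conv2D function, which takes image, out_channels, kernel, padding, and strides as arguments.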

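image : input array
out_channels : number of channels in the output array = number of kernels
kernel : kernel size, given as (height, width)
padding : amount of zero-padding added to each side
strides : stride of the sliding window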

image = np.random.random((3, 5, 5))
output = Conv2D(image, 6, (3, 3), padding=1, strides=1)

def Conv2D(image, out_channels, kernel, padding=0, strides=1):
	pass

To compute the size of the output array, store the channel, height, and width of the input image in separate variables.
Create a kernel of shape (image_channel, kernel[0], kernel[1]). The kernel's channel count must match the channel count of the input feature.
Compute the output size of the convolution in advance and create a zero array of that size.

def Conv2D(image, out_channels, kernel, padding=0, strides=1):
    '''
    input = (C, H, W)
    kernel = (k, k)
    output = (out_channels, output_height, output_width)
    '''
    image_channel, image_height, image_width = image.shape

    # kernel is passed as a (height, width) size; its channel count follows the input
    kernel_height, kernel_width = kernel
    kernel = np.random.random((image_channel, kernel_height, kernel_width))

    # output size along each axis: floor((n - k + 2p) / s) + 1
    output_height = (image_height - kernel_height + 2 * padding) // strides + 1
    output_width = (image_width - kernel_width + 2 * padding) // strides + 1
    output_channel = out_channels

    output = np.zeros((output_channel, output_height, output_width))
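As a quick sanity check on the output-size calculation, the sketch below evaluates floor((n - k + 2p) / s) + 1 for a few settings of the 5x5 input used in this post. The helper name `conv_output_size` is hypothetical and does not appear in the original code.

```python
def conv_output_size(size, kernel_size, padding=0, stride=1):
    """Output length along one spatial axis (hypothetical helper for illustration)."""
    return (size - kernel_size + 2 * padding) // stride + 1

# 5x5 input with a 3x3 kernel
print(conv_output_size(5, 3, padding=1, stride=1))  # 5
print(conv_output_size(5, 3, padding=0, stride=1))  # 3
print(conv_output_size(5, 3, padding=1, stride=2))  # 3
```

With padding=1 and strides=1, a 3x3 kernel keeps the spatial size unchanged, which is why the example call in this post returns a (6, 5, 5) array.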

When zero padding is used, create imagePadded with the size that includes the padding.
Every part of imagePadded other than the zero border is assigned the values of the input image array.

    if padding != 0:
        imagePadded = np.zeros((image_channel, image_height + padding * 2, image_width + padding * 2))
        imagePadded[:, padding:-padding, padding:-padding] = image
    else:
        # without padding, use the input as is so imagePadded is always defined
        imagePadded = image
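The same zero padding can probably be written more compactly with `np.pad`. The sketch below is an alternative to the manual approach above, not the method used in this post; it pads only the height and width axes and checks that both approaches produce the same array.

```python
import numpy as np

image = np.random.random((3, 5, 5))
padding = 1

# manual zero padding, as described in the text
manual = np.zeros((3, 5 + 2 * padding, 5 + 2 * padding))
manual[:, padding:-padding, padding:-padding] = image

# np.pad: (before, after) widths per axis; the channel axis gets no padding
padded = np.pad(image, ((0, 0), (padding, padding), (padding, padding)), mode="constant")

print(padded.shape)                    # (3, 7, 7)
print(np.array_equal(manual, padded))  # True
```

One practical difference is that `np.pad` also accepts a pad width of 0, so no separate branch is needed for the no-padding case.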

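The convolution is computed with loops.
One convolution feature map is produced per filter, so a triple for loop is used to cover the channel axis as well.
The result for each filter is stored in output_per_channel and then assigned to the corresponding channel with output[z, :, :] = output_per_channel.
An if statement inside each for loop keeps the sliding window from going past the bounds of the input.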

    for z in range(0, output_channel):
        output_per_channel = np.zeros((output_height, output_width))

        for y in range(0, output_height):
            if (y*strides + kernel_height) <= imagePadded.shape[1]:

                for x in range(0, output_width):
                    if (x*strides + kernel_width) <= imagePadded.shape[2]:
                        # elementwise product of the window and the kernel, summed over C, H, W
                        output_per_channel[y][x] = np.sum(imagePadded[:,
                                                               y*strides : y*strides + kernel_height,
                                                               x*strides : x*strides + kernel_width] * kernel).astype(np.float32)
        # note: the same kernel is reused for every z, so all output channels hold identical values
        output[z, :, :] = output_per_channel
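Since the number of output channels is meant to equal the number of kernels, one way to make each channel differ is to keep a separate filter per output channel. The sketch below is a variation on the loop above, not the code from this post: the function name `conv2d_multi_kernel` and the `kernels` argument of shape (out_channels, C, kh, kw) are assumptions made for illustration.

```python
import numpy as np

def conv2d_multi_kernel(image, kernels, padding=0, strides=1):
    """Sketch: kernels has shape (out_channels, C, kh, kw), one filter per output channel."""
    out_channels, _, kh, kw = kernels.shape
    _, h, w = image.shape
    out_h = (h - kh + 2 * padding) // strides + 1
    out_w = (w - kw + 2 * padding) // strides + 1
    # zero-pad only the height and width axes
    padded = np.pad(image, ((0, 0), (padding, padding), (padding, padding)))
    output = np.zeros((out_channels, out_h, out_w))
    for z in range(out_channels):
        for y in range(out_h):
            for x in range(out_w):
                window = padded[:, y*strides:y*strides + kh, x*strides:x*strides + kw]
                # filter z produces output channel z
                output[z, y, x] = np.sum(window * kernels[z])
    return output

rng = np.random.default_rng(0)
image = rng.random((3, 5, 5))
kernels = rng.random((6, 3, 3, 3))
out = conv2d_multi_kernel(image, kernels, padding=1, strides=1)
print(out.shape)  # (6, 5, 5)
```

Passing the kernels in as an argument, rather than generating them inside the function, also makes the result reproducible and leaves room for updating the weights later.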

Full example code

'''
implement Convolution 2D using numpy

- used channel-first format => (Channel, Height, Width)

references
https://datascience-enthusiast.com/DL/Convolution_model_Step_by_Stepv2.html
https://medium.com/analytics-vidhya/2d-convolution-using-python-numpy-43442ff5f381
'''

import numpy as np
np.set_printoptions(linewidth=np.inf)

def Conv2D(image, out_channels, kernel, padding=0, strides=1):
    '''
    input = (C, H, W)
    kernel = (k, k)
    output = (out_channels, output_height, output_width)
    '''
    image_channel, image_height, image_width = image.shape[0], image.shape[1], image.shape[2]

    kernel_channel, kernel_height, kernel_width = image_channel, kernel[0], kernel[1]
    kernel = np.random.random((image_channel, kernel_height, kernel_width))    
    
    output_height = int(((image_height - kernel_height + 2 * padding) / strides) + 1)
    output_width= int(((image_width - kernel_width + 2 * padding) / strides) + 1)
    output_channel = out_channels
    
    output = np.zeros((output_channel, output_height, output_width))

    # create zero-padded input
    if padding != 0:
        imagePadded = np.zeros((image_channel, image_height + padding * 2, image_width + padding * 2))
        imagePadded[:, padding:-padding, padding:-padding] = image
    else:
        # without padding, use the input as is so imagePadded is always defined
        imagePadded = image
    
    print('='*50)
    print('imagePadded')
    print(f'imagepadded shape : {imagePadded.shape}')
    print(imagePadded)

    # convolution 2D
    for z in range(0, output_channel):
        output_per_channel = np.zeros((output_height, output_width))
        
        for y in range(0, output_height):
            if (y*strides + kernel_height) <= imagePadded.shape[1]:

                for x in range(0, output_width):                
                    if (x*strides + kernel_width) <= imagePadded.shape[2]:
                        output_per_channel[y][x] = np.sum(imagePadded[:,
                                                               y*strides : y*strides + kernel_height,
                                                               x*strides : x*strides + kernel_width] * kernel).astype(np.float32)
        output[z, :, :] = output_per_channel
    
    print('='*50)
    print('output')
    print(f'output shape : {output.shape}')
    print(output)
    print('='*50)

    return output


image = np.random.random((3, 5, 5))
output = Conv2D(image, 6, (3, 3), padding=1, strides=1)
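The nested loops are easy to follow but slow for larger inputs. As a rough sketch of how the same computation might be vectorized, the version below uses `sliding_window_view` (available in numpy 1.20 and later) together with `np.einsum`. The function name `conv2d_vectorized` is hypothetical, and it assumes the kernels are passed in with shape (out_channels, C, kh, kw) rather than generated randomly inside the function as in the code above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_vectorized(image, kernels, padding=0, strides=1):
    """Hypothetical vectorized variant; kernels has shape (out_channels, C, kh, kw)."""
    _, _, kh, kw = kernels.shape
    padded = np.pad(image, ((0, 0), (padding, padding), (padding, padding)))
    # windows: (C, H_p - kh + 1, W_p - kw + 1, kh, kw), one window per stride-1 position
    windows = sliding_window_view(padded, (kh, kw), axis=(1, 2))
    # keep only the positions that the stride would visit
    windows = windows[:, ::strides, ::strides]
    # for each filter o and position (y, x), sum over channel c and kernel offsets (i, j)
    return np.einsum('cyxij,ocij->oyx', windows, kernels)

rng = np.random.default_rng(0)
image = rng.random((3, 5, 5))
kernels = rng.random((6, 3, 3, 3))
print(conv2d_vectorized(image, kernels, padding=1, strides=1).shape)  # (6, 5, 5)
print(conv2d_vectorized(image, kernels, padding=1, strides=2).shape)  # (6, 3, 3)
```

The loop version remains the clearer reference when checking results by hand; a vectorized form like this one mainly helps once the input grows beyond toy sizes.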

License: Attribution, NonCommercial, ShareAlike
