Hello! I'm currently training a DCGAN on the CelebA image data used in the official PyTorch tutorial. I wrote my code almost exactly the same as the tutorial code, but strangely the training just doesn't work. The model architecture and the hyperparameters such as the learning rate are all identical, so I can't figure out what the cause is. The loss values printed during training are also very different from the ones shown in the tutorial. The discriminator's performance gets very bad, and the generator loss converges to 0, so it looks like the generator is improving, but when I actually display sample images generated in the middle of training they are still just noise.
Which part could be the problem? I'm attaching the full data loading, model definition, and training code below. I'd really appreciate feedback from anyone experienced!
import torch
import torch.nn as nn
from torch.optim import Adam
from torchvision import datasets
from torch.utils.data import DataLoader
import torchvision.transforms as transforms
import torchvision.utils as vutils
nz = 100          # size of the latent vector z
ngf = 64          # base number of feature maps in the generator
ndf = 64          # base number of feature maps in the discriminator
nc = 3            # number of image channels (RGB)
lr = 0.0002
beta1 = 0.5       # beta1 for Adam
image_size = 64
BATCH_SIZE = 128
transforms = transforms.Compose([
    transforms.Resize(image_size),
    transforms.CenterCrop(image_size),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.5, 0.5, 0.5),   # R, G, B
                         std=(0.5, 0.5, 0.5)),   # per-channel (x - mean) / std
])
dataset = datasets.CelebA(root="dataset", split="train", transform=transforms, download=True)
dataloader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=False)
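For reference, a quick way to sanity-check the loader (this snippet is just for illustration, not part of my training code): one batch should come out as [BATCH_SIZE, 3, 64, 64], with values roughly in [-1, 1] after the Normalize above.

x, y = next(iter(dataloader))
print(x.shape)            # expected: torch.Size([128, 3, 64, 64])
print(x.min(), x.max())   # expected: roughly -1.0 and 1.0 after Normalize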
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.constant_(m.bias.data, 0)
class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.main = nn.Sequential(
            # input: [N, nz, 1, 1]
            nn.ConvTranspose2d(in_channels=nz, out_channels=ngf*8,
                               kernel_size=4, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(num_features=ngf*8),
            nn.ReLU(inplace=True),
            # state size: [N, ngf*8, 4, 4]
            nn.ConvTranspose2d(in_channels=ngf*8, out_channels=ngf*4,
                               kernel_size=4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(num_features=ngf*4),
            nn.ReLU(inplace=True),
            # state size: [N, ngf*4, 8, 8]
            nn.ConvTranspose2d(in_channels=ngf*4, out_channels=ngf*2,
                               kernel_size=4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(num_features=ngf*2),
            nn.ReLU(inplace=True),
            # state size: [N, ngf*2, 16, 16]
            nn.ConvTranspose2d(in_channels=ngf*2, out_channels=ngf,
                               kernel_size=4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(num_features=ngf),
            nn.ReLU(inplace=True),
            # state size: [N, ngf, 32, 32]
            nn.ConvTranspose2d(in_channels=ngf, out_channels=nc,
                               kernel_size=4, stride=2, padding=1, bias=False),
            nn.Tanh()
            # output: [N, nc, 64, 64]
        )

    def forward(self, x):
        x = self.main(x)
        return x
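A quick shape check for the generator (illustrative only): a latent batch of shape [N, nz, 1, 1] should come out as a [N, 3, 64, 64] image batch.

g = Generator()
z = torch.randn(2, nz, 1, 1)
print(g(z).shape)   # expected: torch.Size([2, 3, 64, 64])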
class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.main = nn.Sequential(
            # input: [N, nc, 64, 64]
            nn.Conv2d(in_channels=nc, out_channels=ndf,
                      kernel_size=4, stride=2, padding=1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # state size: [N, ndf, 32, 32]
            nn.Conv2d(in_channels=ndf, out_channels=ndf*2,
                      kernel_size=4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(num_features=ndf*2),
            nn.LeakyReLU(0.2, inplace=True),
            # state size: [N, ndf*2, 16, 16]
            nn.Conv2d(in_channels=ndf*2, out_channels=ndf*4,
                      kernel_size=4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(num_features=ndf*4),
            nn.LeakyReLU(0.2, inplace=True),
            # state size: [N, ndf*4, 8, 8]
            nn.Conv2d(in_channels=ndf*4, out_channels=ndf*8,
                      kernel_size=4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(num_features=ndf*8),
            nn.LeakyReLU(0.2, inplace=True),
            # state size: [N, ndf*8, 4, 4]
            nn.Conv2d(in_channels=ndf*8, out_channels=1,
                      kernel_size=4, stride=2, padding=0, bias=False),
            nn.Sigmoid()
            # output: [N, 1, 1, 1]
        )

    def forward(self, x):
        x = self.main(x)
        return x
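And the same kind of check for the discriminator (again just illustrative): a [N, 3, 64, 64] batch should map to one probability per image.

d = Discriminator()
img = torch.randn(2, nc, 64, 64)
print(d(img).shape)   # expected: torch.Size([2, 1, 1, 1]), i.e. one value per image after .view(-1)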
# model
generator = Generator()
discriminator = Discriminator()
# init params in model
generator.apply(weights_init)
discriminator.apply(weights_init)
# loss
criterion = nn.BCELoss()
fixed_noise = torch.rand(64, nz, 1, 1)
real_label = 1
fake_label = 0
# optimizer
optimizerD = Adam(generator.parameters(), lr=lr, betas=(beta1, 0.999))
optimizerG = Adam(discriminator.parameters(), lr=lr, betas=(beta1, 0.999))
# Train
img_list = []
g_losses = []
d_losses = []
iters = 0
num_epochs = 5
for epoch in range(num_epochs):
    for i, (x, y) in enumerate(dataloader, 0):
        # train Discriminator
        # 1. train based on real-image
        optimizerD.zero_grad()
        real_label = torch.ones(len(x))
        real_pred = discriminator(x).view(-1)
        # 2. train based on fake-image
        fake_label = torch.zeros(len(x))
        noise = torch.randn(len(x), nz, 1, 1)
        fake_x = generator(noise)
        fake_pred = discriminator(fake_x.detach()).view(-1)
        fake_loss = criterion(fake_pred, fake_label)
        real_loss = criterion(real_pred, real_label)
        d_loss = fake_loss + real_loss
        fake_loss.backward()
        real_loss.backward()
        optimizerD.step()
        # train Generator
        optimizerG.zero_grad()
        fake_pred = discriminator(fake_x).view(-1)
        g_loss = criterion(fake_pred, real_label)
        g_loss.backward()
        optimizerG.step()
        if i % 50 == 0:
            print("[%d|%d] [%d|%d]\tLoss D: %.4f\tLoss G: %.4f" % (epoch+1, num_epochs, i, len(dataloader), d_loss.item(), g_loss.item()))
        d_losses.append(fake_loss.item() + real_loss.item())
        g_losses.append(g_loss.item())
        if (iters % 500 == 0) or ((epoch == num_epochs-1) and (i == len(dataloader)-1)):
            with torch.no_grad():
                fake = generator(fixed_noise).detach()
            img_list.append(vutils.make_grid(fake, padding=2, normalize=True))
        iters += 1
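To view the intermediate samples collected in img_list I use something like the following (the matplotlib part is just a sketch); this is where the output still looks like pure noise:

import matplotlib.pyplot as plt
import numpy as np

plt.figure(figsize=(8, 8))
plt.axis("off")
plt.title("Generated samples")
plt.imshow(np.transpose(img_list[-1].numpy(), (1, 2, 0)))
plt.show()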
When I run the code above, the loss prints out like this:
[1|5] [0|1272] Loss D: 1.8754 Loss G: 0.6190
[1|5] [50|1272] Loss D: 28.6092 Loss G: 0.0000
[1|5] [100|1272] Loss D: 29.8591 Loss G: 0.0000
[1|5] [150|1272] Loss D: 27.4660 Loss G: 0.0000
[1|5] [200|1272] Loss D: 29.3481 Loss G: 0.0000
[1|5] [250|1272] Loss D: 26.0290 Loss G: 0.0000
[1|5] [300|1272] Loss D: 26.7343 Loss G: 0.0000
[1|5] [350|1272] Loss D: 26.7163 Loss G: 0.0000
[1|5] [400|1272] Loss D: 29.5696 Loss G: 0.0000
[1|5] [450|1272] Loss D: 30.0844 Loss G: 0.0000
[1|5] [500|1272] Loss D: 31.5356 Loss G: 0.0000
[1|5] [550|1272] Loss D: 30.8461 Loss G: 0.0000
[1|5] [600|1272] Loss D: 30.7053 Loss G: 0.0000
[1|5] [650|1272] Loss D: 32.1853 Loss G: 0.0000
[1|5] [700|1272] Loss D: 30.2996 Loss G: 0.0000
[1|5] [750|1272] Loss D: 36.8295 Loss G: 0.0000
[1|5] [800|1272] Loss D: 29.6888 Loss G: 0.0000
[1|5] [850|1272] Loss D: 37.4410 Loss G: 0.0000
[1|5] [900|1272] Loss D: 34.1657 Loss G: 0.0000
[1|5] [950|1272] Loss D: 39.4706 Loss G: 0.0000
[1|5] [1000|1272] Loss D: 32.2947 Loss G: 0.0000
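For completeness, this is roughly how I plot the d_losses / g_losses lists collected above (again just a sketch):

import matplotlib.pyplot as plt

plt.figure(figsize=(10, 5))
plt.title("Generator and Discriminator loss during training")
plt.plot(g_losses, label="G")
plt.plot(d_losses, label="D")
plt.xlabel("iterations")
plt.ylabel("loss")
plt.legend()
plt.show()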