
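INADE: A Personal Walkthrough and Interpretation

INADE builds on SPADE. It is a conditional normalization method that aims to increase the diversity of semantic image synthesis. It combines semantic segmentation (which supplies what is common to a class) with instance segmentation (which supplies what is particular to an individual), and coordinates all normalization layers through one unified noise sample to avoid inconsistency. An encoder produces image-dependent noise to help training. Together these address the tendency of existing methods to give instances of the same semantic class near-identical styles, and enable both semantic-level and instance-level diversity.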



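INADE explained (mostly my own understanding and opinions)

1. Paper title: Diverse Semantic Image Synthesis via Probability Distribution Modeling

2. Original project: https://github.com/tzt101/INADE

This paper is again built mainly on the SPADE paper, and I still strongly recommend the SPADE walkthrough in the FutureSI project. Here SPADE is reworked into INADE, which is likewise a conditional normalization layer. The authors' goal in making this change is to increase the diversity of the generated images. Details follow below.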

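Here is how the paper frames the semantic image synthesis task:

Conditional normalization, whether spatially adaptive [30] or class-adaptive [37], has been shown to help semantic image synthesis. Semantically conditioned modulation largely prevents the "wash-away" effect on semantic information caused by repeated normalization.

In other words, SPADE works well.

However, because the normalization is conditioned only on the semantic map, and only global randomness is used to diversify image styles [30], achieving promising results with semantic-level or even instance-level diversity remains challenging.

The point here is that feeding the model only a semantic segmentation map is clearly not enough. Expecting good diversity from that input alone is close to impossible.

Semantic-level diversity was achieved in [51] through group convolution, but using this kind of convolution rules out extending it to instance-level diversity via instance maps.

This says that GroupDNet falls short here. I covered GroupDNet in another project of mine, a walkthrough of a semantic image synthesis paper that requires controlling each semantic class separately. That is a different paper.

Recent work on instance-aware synthesis [41, 6] has mainly focused on better object boundaries rather than on the diversity and realism of each individual instance. Lacking proper instance conditioning, existing methods tend to make instances with the same semantic label converge to similar styles, which severely harms generation diversity.

This echoes the second point above.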

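The paper then explains where the idea of adding instance segmentation to INADE comes from:

The key to instance-level diversity is a proper combination of a unified semantic-level distribution, which deterministically fixes the general characteristics of a given semantic label, and instance-level randomness, which introduces the diversity that the semantic distribution model allows.

As I see it, this is a sensible combination of the general and the particular: semantic segmentation supplies the general, instance segmentation supplies the particular. Take two birds flying side by side: both are birds (general), but they are two different individuals (particular). Without an instance segmentation input, the method degrades to using the semantic segmentation alone.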
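To make the "general plus particular" idea concrete, here is a minimal NumPy sketch. It is not the authors' code: the names `class_scale` and `class_shift` and all the toy sizes are assumptions chosen for illustration. Each semantic class owns an affine transform (learned in a real model, random here), and each instance draws its own noise, so two instances of the same class share a distribution but receive different modulation parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, channels = 3, 4

# Class-level parameters (hypothetical stand-ins for learned weights).
class_scale = rng.uniform(0.5, 1.5, size=(num_classes, channels))
class_shift = rng.normal(size=(num_classes, channels))

# Two instances of class 1 (say, two birds) and one instance of class 2.
instance_class = np.array([1, 1, 2])

# Instance-level randomness: one noise vector per instance.
instance_noise = rng.normal(size=(len(instance_class), channels))

# Modulation parameter per instance: class transform applied to instance noise.
gamma = class_scale[instance_class] * instance_noise + class_shift[instance_class]

# Without instance noise, every instance of a class gets identical parameters.
gamma_no_instance = class_shift[instance_class]
```

The two birds end up with different rows in `gamma`, while `gamma_no_instance` gives them the same row, which is the style collapse the paper is trying to avoid.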

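Since the generator contains several conditional normalization layers, a unified sampling scheme is still key to coordinating all of them. The straightforward approach, sampling independently at each normalization layer, can introduce inconsistency and largely neutralize the diversity. In this paper we therefore propose an instance-adaptive modulation sampling method that achieves consistent instance sampling across multiple normalization layers with unequal channel counts.

Put plainly, INADE takes one noise input that runs through the whole decoder. The paper makes a point of how strong this noise-fusion idea is, as papers tend to do.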
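A minimal NumPy sketch of the "one noise for all layers" idea, under my own assumptions: the layer widths and the random matrices in `projections` are made up, and each matrix merely stands in for a learned linear layer. The noise is drawn once per instance and every layer projects that same sample to its own channel count.

```python
import numpy as np

rng = np.random.default_rng(0)
num_instances, noise_dim = 2, 8
layer_channels = [16, 8, 4]  # hypothetical decoder layers with different widths

# One noise sample per instance, drawn once and reused by every layer.
shared_noise = rng.normal(size=(num_instances, noise_dim))

# Each layer has its own projection (stand-in for a learned linear layer).
projections = [rng.normal(size=(noise_dim, c)) for c in layer_channels]

# Consistent sampling: all layers see the same underlying noise.
per_layer = [shared_noise @ w for w in projections]
shapes = [p.shape for p in per_layer]
```

Drawing a fresh sample inside each layer would instead give every layer an unrelated style code for the same instance, which is the inconsistency the quoted passage warns about.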

Generator architecture


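So what is INADE's main idea for improving diversity? Compared with SPADE, it takes one extra input: an instance segmentation map.

[Figure: INADE generator with semantic map m and instance map p as inputs]

m is the semantic segmentation input and p is the instance segmentation input. The example is presumably one photo with two birds: the semantic map gives both the same label, while the instance map tells them apart.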

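INADE in mathematical form

[Formula image: the INADE modulation equations from the paper]

If the formula is hard to follow, don't worry, just read the code. Papers tell stories; code does not.

This is INADE in the actual PyTorch code: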

import re

import torch
import torch.nn as nn
import torch.nn.functional as F
# SynchronizedBatchNorm2d ships with SPADE-style code bases; this import path
# follows that repo layout and may need adjusting in your own project.
from models.networks.sync_batchnorm import SynchronizedBatchNorm2d


# The listing names the class ILADE; it implements the INADE layer.
class ILADE(nn.Module):
    def __init__(self, config_text, norm_nc, label_nc, noise_nc):
        super().__init__()
        self.norm_nc = norm_nc

        assert config_text.startswith('spade')
        parsed = re.search(r'spade(\D+)(\d)x\d', config_text)
        param_free_norm_type = str(parsed.group(1))

        if param_free_norm_type == 'instance':
            self.param_free_norm = nn.InstanceNorm2d(norm_nc, affine=False)
        elif param_free_norm_type == 'syncbatch':
            self.param_free_norm = SynchronizedBatchNorm2d(norm_nc, affine=False)
        elif param_free_norm_type == 'batch':
            self.param_free_norm = nn.BatchNorm2d(norm_nc, affine=False)
        else:
            raise ValueError('%s is not a recognized param-free norm type in SPADE'
                             % param_free_norm_type)

        # weights and bias for each class; the last axis (size 2) holds the
        # parameters for the scale branch [..., 0] and the bias branch [..., 1]
        self.weight = nn.Parameter(torch.Tensor(label_nc, norm_nc, 2))
        self.bias = nn.Parameter(torch.Tensor(label_nc, norm_nc, 2))
        self.reset_parameters()
        self.fc_noise = nn.Linear(noise_nc, norm_nc)

    def reset_parameters(self):
        nn.init.uniform_(self.weight)
        nn.init.zeros_(self.bias)

    def forward(self, x, segmap, input_instances=None, noise=None):
        # Part 1. generate parameter-free normalized activations
        # noise is [B, inst_nc, 2, noise_nc], 2 is for scale and bias
        normalized = self.param_free_norm(x)

        # Part 2. scale the segmentation mask and instance mask
        segmap = F.interpolate(segmap, size=x.size()[2:], mode='nearest')
        input_instances = F.interpolate(input_instances, size=x.size()[2:], mode='nearest')
        # the segmap is concatenated with the instance map (last channel)
        inst_map = torch.unsqueeze(segmap[:, -1, :, :], 1)  # not used below
        segmap = segmap[:, :-1, :, :]

        # Part 3. class affine with noise
        noise_size = noise.size()  # [B, inst_nc, 2, noise_nc]
        noise_reshape = noise.view(-1, noise_size[-1])  # reshape to [B*inst_nc*2, noise_nc]
        noise_fc = self.fc_noise(noise_reshape)  # [B*inst_nc*2, norm_nc]
        noise_fc = noise_fc.view(noise_size[0], noise_size[1], noise_size[2], -1)

        # create weighted instance noise for scale
        class_weight = torch.einsum('ic,nihw->nchw', self.weight[..., 0], segmap)
        class_bias = torch.einsum('ic,nihw->nchw', self.bias[..., 0], segmap)
        # init_noise = torch.randn([x.size()[0], input_instances.size()[1], self.norm_nc], device=x.get_device())
        instance_noise = torch.einsum('nic,nihw->nchw', noise_fc[:, :, 0, :], input_instances)
        scale_instance_noise = class_weight * instance_noise + class_bias

        # create weighted instance noise for bias
        class_weight = torch.einsum('ic,nihw->nchw', self.weight[..., 1], segmap)
        class_bias = torch.einsum('ic,nihw->nchw', self.bias[..., 1], segmap)
        # init_noise = torch.randn([x.size()[0], input_instances.size()[1], self.norm_nc], device=x.get_device())
        instance_noise = torch.einsum('nic,nihw->nchw', noise_fc[:, :, 1, :], input_instances)
        bias_instance_noise = class_weight * instance_noise + class_bias

        out = scale_instance_noise * normalized + bias_instance_noise
        return out
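
Reading the forward pass above, the layer can be summarized as follows. This is my own transcription of the listing, not the paper's equation, and the notation is mine: $\hat{x}$ is the normalized activation, $m$ the semantic map (`segmap`), $p$ the instance map (`input_instances`), $\phi$ the linear layer `fc_noise`, $z^{s}_j$ and $z^{b}_j$ the two noise vectors of instance $j$, and $W^{s}, B^{s}$ and $W^{b}, B^{b}$ the slices `[..., 0]` and `[..., 1]` of `weight` and `bias`. The batch index is omitted.

```latex
\begin{aligned}
\gamma_{c,h,w} &= \Big(\sum_i W^{s}_{i,c}\, m_{i,h,w}\Big)\Big(\sum_j \phi(z^{s}_j)_c\, p_{j,h,w}\Big) + \sum_i B^{s}_{i,c}\, m_{i,h,w} \\
\beta_{c,h,w}  &= \Big(\sum_i W^{b}_{i,c}\, m_{i,h,w}\Big)\Big(\sum_j \phi(z^{b}_j)_c\, p_{j,h,w}\Big) + \sum_i B^{b}_{i,c}\, m_{i,h,w} \\
\mathrm{out}_{c,h,w} &= \gamma_{c,h,w}\,\hat{x}_{c,h,w} + \beta_{c,h,w}
\end{aligned}
```

When $m$ and $p$ are one-hot, each sum just selects the entry of the class or instance that covers pixel $(h, w)$, so every pixel gets its class's affine transform applied to its instance's projected noise.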

Below is the Paddle version I wrote:


import paddle
import paddle.nn as nn
import paddle.nn.functional as F

# A note on einsum, which I suspect most people rarely use (I certainly don't). Example:
#   instance_noise = paddle.einsum('nic,nihw->nchw', noise_fc[:, :, 0, :], input_instances)
# noise_fc[:, :, 0, :] has shape [B, instance_nc, norm_nc] and input_instances has shape
# [B, instance_nc, h, w]; the result has shape [B, norm_nc, h, w]. In effect this is a
# bias-free nn.Linear applied along axis 1, i.e. a matrix multiplication over the instance axis.


class INADE(nn.Layer):
    def __init__(self, norm_nc=64, label_nc=46, noise_nc=108):
        super().__init__()

        self.param_free_norm = nn.InstanceNorm2D(norm_nc, weight_attr=False, bias_attr=False)

        # weights and bias for each class
        # Uniform(0, 1) matches the default range of torch's nn.init.uniform_
        # (Paddle's Uniform() defaults to [-1, 1]).
        weight = self.create_parameter(
            [label_nc, norm_nc, 2],
            default_initializer=paddle.nn.initializer.Uniform(low=0.0, high=1.0))
        self.add_parameter("weight", weight)

        bias = self.create_parameter(
            [label_nc, norm_nc, 2],
            default_initializer=paddle.nn.initializer.Constant())  # zeros
        self.add_parameter("bias", bias)
        self.fc_noise = nn.Linear(noise_nc, norm_nc)

    def forward(self, x, segmap, input_instances=None, noise=None):
        # Part 1. generate parameter-free normalized activations
        # noise is [B, inst_nc, 2, noise_nc], 2 is for scale and bias
        normalized = self.param_free_norm(x)

        # Part 2. scale the segmentation mask and instance mask
        segmap = F.interpolate(segmap, size=x.shape[2:], mode='nearest')
        input_instances = F.interpolate(input_instances, size=x.shape[2:], mode='nearest')
        # the segmap is concatenated with the instance map (last channel)
        inst_map = paddle.unsqueeze(segmap[:, -1, :, :], 1)  # not used afterwards
        segmap = segmap[:, :-1, :, :]

        # Part 3. class affine with noise
        noise_size = noise.shape  # [B, inst_nc, 2, noise_nc]
        noise_reshape = noise.reshape([-1, noise_size[-1]])  # reshape to [B*inst_nc*2, noise_nc]
        noise_fc = self.fc_noise(noise_reshape)  # [B*inst_nc*2, norm_nc]
        noise_fc = noise_fc.reshape([noise_size[0], noise_size[1], noise_size[2], -1])  # [B, instance_nc, 2, norm_nc]
        print("noise_fc", noise_fc.shape)

        # create weighted instance noise for scale
        # [label_nc, norm_nc], [B, label_nc, h, w] -> [B, norm_nc, h, w]
        class_weight = paddle.einsum('ic,nihw->nchw', self.weight[..., 0], segmap)
        print("class_weight", class_weight.shape)
        class_bias = paddle.einsum('ic,nihw->nchw', self.bias[..., 0], segmap)
        # [B, instance_nc, norm_nc], [B, instance_nc, h, w] -> [B, norm_nc, h, w]
        instance_noise = paddle.einsum('nic,nihw->nchw', noise_fc[:, :, 0, :], input_instances)
        scale_instance_noise = class_weight * instance_noise + class_bias

        # create weighted instance noise for bias
        class_weight = paddle.einsum('ic,nihw->nchw', self.weight[..., 1], segmap)
        class_bias = paddle.einsum('ic,nihw->nchw', self.bias[..., 1], segmap)
        instance_noise = paddle.einsum('nic,nihw->nchw', noise_fc[:, :, 1, :], input_instances)
        bias_instance_noise = class_weight * instance_noise + class_bias

        out = scale_instance_noise * normalized + bias_instance_noise
        return out


x = paddle.randn([3, 64, 50, 50])
segmap = paddle.randn([3, 47, 66, 66])  # 46 class channels + 1 instance-map channel
inst = paddle.randn([3, 72, 50, 50])
noise = paddle.randn([3, 72, 2, 108])
print(INADE()(x, segmap, inst, noise).shape)

W0222 11:03:38.883304   183 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0222 11:03:38.890153   183 device_context.cc:465] device: 0, cuDNN Version: 7.6.

noise_fc [3, 72, 2, 64]
class_weight [3, 64, 50, 50]

[3, 64, 50, 50]
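
For readers who do not use einsum often, the following NumPy sketch checks what the `'nic,nihw->nchw'` contraction does. The sizes are toy values I picked, and the masks are made one-hot for the demonstration (the random test tensors above are not). It compares the einsum result with a batched matrix product and with a plain per-pixel lookup.

```python
import numpy as np

rng = np.random.default_rng(0)
B, inst_nc, norm_nc, H, W = 2, 3, 4, 5, 5

noise_fc = rng.normal(size=(B, inst_nc, norm_nc))

# One-hot instance masks: every pixel belongs to exactly one instance.
ids = rng.integers(0, inst_nc, size=(B, H, W))
input_instances = np.eye(inst_nc)[ids].transpose(0, 3, 1, 2)  # [B, inst_nc, H, W]

# The contraction used in the layer: sum over the instance axis i.
out_einsum = np.einsum('nic,nihw->nchw', noise_fc, input_instances)

# Same result as a batched matrix product over the instance axis.
flat = input_instances.reshape(B, inst_nc, H * W)
out_matmul = (noise_fc.transpose(0, 2, 1) @ flat).reshape(B, norm_nc, H, W)

# With one-hot masks it is simply a lookup: each pixel gets its instance's vector.
out_lookup = noise_fc[np.arange(B)[:, None, None], ids].transpose(0, 3, 1, 2)
```

All three arrays agree, which is why the einsum can be read as "paint each instance's projected noise vector onto the pixels that instance covers".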

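As mentioned above, the whole generator uses the same noise.

One more detail: this noise has to be related to the input image, so z effectively carries information. That makes training easier. The code for this is quite involved; if you are really interested, read the original project. Since I could not use it in my own project, I did not pay much attention to it. Good luck.

Just kidding. How this noise is actually realized in code matters a great deal. First, the noise is what lets you increase and control the diversity of what the model generates. Second, at the start of training the noise must be tied to the information in the real image, which makes training tractable; if the noise is purely randomly initialized, training is very hard.

Next I go straight to the code and walk through the main pieces of the actual PyTorch implementation.

A few APIs used along the way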

nn.Unfold


import paddle
import paddle.nn as nn

x = paddle.randn((100, 3, 224, 224))
unfold = nn.Unfold(kernel_sizes=[3, 3])
result = unfold(x)  # result.shape = [100, 3*3*3, (224-3+1)*(224-3+1)]
print(result.shape)
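
To show what the unfold (im2col) operation produces, here is a small NumPy sketch for the stride 1, no padding case. The helper `unfold_np` is a name I made up; it is an illustration of the layout, not Paddle's implementation. Each output column holds one k×k window, with the rows ordered channel by channel.

```python
import numpy as np

def unfold_np(x, k):
    """im2col for stride 1, no padding: [N, C, H, W] -> [N, C*k*k, L]."""
    n, c, h, w = x.shape
    out_h, out_w = h - k + 1, w - k + 1
    pieces = []
    for di in range(k):
        for dj in range(k):
            # Entry (di, dj) of every window, for all window positions at once.
            patch = x[:, :, di:di + out_h, dj:dj + out_w]
            pieces.append(patch.reshape(n, c, 1, out_h * out_w))
    # [N, C, k*k, L] -> [N, C*k*k, L]: channel-major, then window position.
    return np.concatenate(pieces, axis=2).reshape(n, c * k * k, out_h * out_w)

x = np.arange(2 * 3 * 6 * 6, dtype=np.float64).reshape(2, 3, 6, 6)
cols = unfold_np(x, 3)
```

For a 6×6 input and a 3×3 kernel there are 4×4 = 16 window positions, so `cols` has shape `(2, 27, 16)`, following the same `C*k*k` by `L` pattern as the shape comment in the Paddle snippet.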

paddle.clip
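
paddle.clip limits every element of a tensor to a given range. The instance-aware convolution in this article uses it to turn differences between instance ids into a binary mask. A minimal NumPy sketch of that trick, with `np.clip` standing in for `paddle.clip` and a toy 3×3 window that I made up:

```python
import numpy as np

# Instance ids in a 3x3 window, flattened row by row; index 4 is the center.
window_ids = np.array([7.0, 7.0, 2.0,
                       7.0, 7.0, 2.0,
                       5.0, 7.0, 7.0])
center = window_ids[4]

# |id - center| is 0 for the same instance and >= 1 otherwise (ids are integers),
# so clipping to [0, 1] and flipping yields a binary "same instance" mask.
same_instance = 1.0 - np.clip(np.abs(window_ids - center), 0.0, 1.0)
```

Multiplying a window of features by `same_instance` zeroes out every neighbor that belongs to a different instance than the center pixel.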



# This cell rebuilds the convolution wrapper used inside the encoder.
# The encoder exists only to construct the noise during training.
import paddle
import paddle.nn as nn
import paddle.nn.functional as F


class InstanceAwareConv2d(nn.Layer):
    def __init__(self, fin=64, fout=128, kw=3, stride=1, padding=1):
        super().__init__()
        self.kw = kw
        self.stride = stride
        self.padding = padding
        self.fin = fin
        self.fout = fout
        self.unfold = nn.Unfold(kw, strides=stride, paddings=padding)

        # uniform random initialization
        weight = self.create_parameter(
            [fout, fin, kw, kw], default_initializer=paddle.nn.initializer.Uniform())
        self.add_parameter("weight", weight)

        bias = self.create_parameter(
            [fout], default_initializer=paddle.nn.initializer.Constant())
        self.add_parameter("bias", bias)

    def forward(self, x, instances, check=False):
        N, C, H, W = x.shape

        # compute the binary mask from the instance map
        instances = F.interpolate(instances, x.shape[2:], mode='nearest')  # [n,1,h,w]
        inst_unf = self.unfold(instances)  # [n,k*k,L]
        # subtract the center pixel; the instance map has a single channel,
        # so the center of each window sits at index kw*kw//2
        center = paddle.unsqueeze(inst_unf[:, self.kw * self.kw // 2, :], axis=1)
        mask_unf = inst_unf - center
        # clip the absolute value to 0~1, then flip: 1 where the neighbor
        # belongs to the same instance as the center, 0 elsewhere
        mask_unf = paddle.abs(mask_unf)
        mask_unf = paddle.clip(mask_unf, 0, 1)
        mask_unf = 1.0 - mask_unf  # [n,k*k,L], e.g. [4, 9, 65536]

        # multiply mask_unf and x
        x_unf = self.unfold(x)  # [n,c*k*k,L], e.g. [4, 64*9, 65536]
        x_unf = x_unf.reshape([N, C, -1, x_unf.shape[-1]])  # [n,c,k*k,L]
        mask = paddle.unsqueeze(mask_unf, 1)  # [n,1,k*k,L]
        mask_x = mask * x_unf  # [n,c,k*k,L]
        mask_x = mask_x.reshape([N, -1, mask_x.shape[-1]])  # [n,c*k*k,L]

        # conv operation
        weight = self.weight.reshape([self.fout, -1])  # [fout, c*k*k]
        out = paddle.einsum('cm,nml->ncl', weight, mask_x)  # [n,fout,L]
        bias = paddle.unsqueeze(paddle.unsqueeze(self.bias, 0), -1)  # [1,fout,1]
        out = out + bias
        out = out.reshape([N, self.fout, H // self.stride, W // self.stride])

        if check:
            # compare against a plain convolution (equal only when the mask is all ones)
            out2 = F.conv2d(x, self.weight, self.bias, stride=self.stride, padding=self.padding)
            print((out - out2).abs().max())
        return out


x = paddle.randn([4, 64, 256, 256])
y = paddle.randn([4, 1, 256, 256])
print(InstanceAwareConv2d()(x, y).shape)

W0222 16:13:26.634137   145 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0222 16:13:26.638576   145 device_context.cc:465] device: 0, cuDNN Version: 7.6.

[4, 128, 256, 256]


# Building the encoder
import numpy as np
import paddle
import paddle.nn as nn


class Encoder_OPT:
    def __init__(self):
        super().__init__()
        self.ngf = 64
        self.semantic_nc = 46
        self.no_instance = True
        self.noise_nc = 108


opt = Encoder_OPT()


class instanceAdaptiveEncoder(nn.Layer):
    def __init__(self, opt):
        super().__init__()
        self.opt = opt
        kw = 3
        pw = int(np.ceil((kw - 1.0) / 2))
        ndf = opt.ngf
        conv_layer = InstanceAwareConv2d

        self.layer1 = conv_layer(3, ndf, kw, stride=2, padding=pw)
        self.norm1 = nn.InstanceNorm2D(ndf)
        self.layer2 = conv_layer(ndf * 1, ndf * 2, kw, stride=2, padding=pw)
        self.norm2 = nn.InstanceNorm2D(ndf * 2)
        self.layer3 = conv_layer(ndf * 2, ndf * 4, kw, stride=2, padding=pw)
        self.norm3 = nn.InstanceNorm2D(ndf * 4)
        self.layer4 = conv_layer(ndf * 4, ndf * 8, kw, stride=2, padding=pw)
        self.norm4 = nn.InstanceNorm2D(ndf * 8)

        self.middle = conv_layer(ndf * 8, ndf * 4, kw, stride=1, padding=pw)
        self.norm_middle = nn.InstanceNorm2D(ndf * 4)
        self.up1 = conv_layer(ndf * 8, ndf * 2, kw, stride=1, padding=pw)
        self.norm_up1 = nn.InstanceNorm2D(ndf * 2)
        self.up2 = conv_layer(ndf * 4, ndf * 1, kw, stride=1, padding=pw)
        self.norm_up2 = nn.InstanceNorm2D(ndf)
        self.up3 = conv_layer(ndf * 2, ndf, kw, stride=1, padding=pw)
        self.norm_up3 = nn.InstanceNorm2D(ndf)

        self.up = nn.Upsample(scale_factor=2, mode='bilinear')
        self.class_nc = opt.semantic_nc if opt.no_instance else opt.semantic_nc - 1

        self.scale_conv_mu = conv_layer(ndf, opt.noise_nc, kw, stride=1, padding=pw)
        self.scale_conv_var = conv_layer(ndf, opt.noise_nc, kw, stride=1, padding=pw)
        self.bias_conv_mu = conv_layer(ndf, opt.noise_nc, kw, stride=1, padding=pw)
        self.bias_conv_var = conv_layer(ndf, opt.noise_nc, kw, stride=1, padding=pw)

        self.actvn = nn.LeakyReLU(0.2, False)

    def instAvgPooling(self, x, instances):
        inst_num = instances.shape[1]
        for i in range(inst_num):
            inst_mask = paddle.unsqueeze(instances[:, i, :, :], 1)  # [n,1,h,w]
            pixel_num = paddle.sum(paddle.sum(inst_mask, axis=2, keepdim=True), axis=3, keepdim=True)
            # an instance may cover no pixels; set the count to 1 so the division below is safe
            pixel_num[pixel_num == 0] = 1
            # keep only the features of this instance; inst_mask is 0 or 1
            feat = x * inst_mask
            feat = paddle.sum(paddle.sum(feat, axis=2, keepdim=True), axis=3, keepdim=True) / pixel_num
            if i == 0:
                out = paddle.unsqueeze(feat[:, :, 0, 0], 1)  # [n,1,c]
            else:
                out = paddle.concat([out, paddle.unsqueeze(feat[:, :, 0, 0], 1)], 1)
        return out  # [n, inst_num, c]

    def forward(self, x, input_instances):
        # instances: [n,1,h,w]; input_instances: [n,inst_nc,h,w]. Mind these shapes.
        instances = paddle.argmax(input_instances, 1, keepdim=True).astype("float32")
        print("instance", instances.shape)

        x1 = self.actvn(self.norm1(self.layer1(x, instances)))
        x2 = self.actvn(self.norm2(self.layer2(x1, instances)))
        x3 = self.actvn(self.norm3(self.layer3(x2, instances)))
        x4 = self.actvn(self.norm4(self.layer4(x3, instances)))
        # x1 [4, 64, 128, 128] x2 [4, 128, 64, 64] x3 [4, 256, 32, 32] x4 [4, 512, 16, 16]
        print("x1", x1.shape, "x2", x2.shape, "x3", x3.shape, "x4", x4.shape)

        y = self.up(self.actvn(self.norm_middle(self.middle(x4, instances))))
        y1 = self.up(self.actvn(self.norm_up1(self.up1(paddle.concat([y, x3], 1), instances))))
        y2 = self.up(self.actvn(self.norm_up2(self.up2(paddle.concat([y1, x2], 1), instances))))
        y3 = self.up(self.actvn(self.norm_up3(self.up3(paddle.concat([y2, x1], 1), instances))))
        # y [4, 256, 32, 32] y1 [4, 128, 64, 64] y2 [4, 64, 128, 128] y3 [4, 64, 256, 256]
        print("y", y.shape, "y1", y1.shape, "y2", y2.shape, "y3", y3.shape)

        scale_mu = self.scale_conv_mu(y3, instances)
        scale_var = self.scale_conv_var(y3, instances)
        bias_mu = self.bias_conv_mu(y3, instances)
        bias_var = self.bias_conv_var(y3, instances)

        scale_mus = self.instAvgPooling(scale_mu, input_instances)
        scale_vars = self.instAvgPooling(scale_var, input_instances)
        bias_mus = self.instAvgPooling(bias_mu, input_instances)
        bias_vars = self.instAvgPooling(bias_var, input_instances)
        # each has shape [batch_size, instance_nc, noise_nc]
        return scale_mus, scale_vars, bias_mus, bias_vars


encoder = instanceAdaptiveEncoder(opt)
x = paddle.randn([4, 3, 256, 256])
input_instances = paddle.randn([4, 72, 256, 256])
encoder(x, input_instances)
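
The per-instance average pooling in `instAvgPooling` can also be written without the Python loop. Below is a NumPy sketch of the same computation on a tiny hand-made input; the function name `inst_avg_pooling` and the toy arrays are mine, and the masks are assumed to be 0/1.

```python
import numpy as np

def inst_avg_pooling(x, masks):
    """x: [N, C, H, W], masks: [N, I, H, W] with 0/1 entries -> [N, I, C] means."""
    sums = np.einsum('nchw,nihw->nic', x, masks)   # per-instance feature sums
    counts = masks.sum(axis=(2, 3))                # pixels per instance, [N, I]
    counts = np.where(counts == 0, 1, counts)      # absent instance: avoid 0/0
    return sums / counts[:, :, None]

x = np.array([[[[1.0, 2.0], [3.0, 4.0]]]])         # [1, 1, 2, 2]
masks = np.array([[[[1.0, 1.0], [0.0, 0.0]],       # instance 0: top row
                   [[0.0, 0.0], [1.0, 1.0]],       # instance 1: bottom row
                   [[0.0, 0.0], [0.0, 0.0]]]])     # instance 2: absent
pooled = inst_avg_pooling(x, masks)
```

Instance 0 averages the top row (1.5), instance 1 the bottom row (3.5), and the absent instance falls back to 0 thanks to the count guard.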


class Encoder_OPT:
    def __init__(self):
        super().__init__()
        self.ngf = 64
        self.semantic_nc = 2
        self.no_instance = True
        self.noise_nc = 108


opt = Encoder_OPT()


def instance_encode_z(real_image, input_instances):
    s_mus, s_logvars, b_mus, b_logvars = instanceAdaptiveEncoder(opt)(real_image, input_instances)
    # mean and standard deviation (exp(0.5 * logvar)) for the scale and bias noise
    z = [s_mus, paddle.exp(0.5 * s_logvars), b_mus, paddle.exp(0.5 * b_logvars)]
    return z, s_mus, s_logvars, b_mus, b_logvars


instance_nc = 2
real_image = paddle.randn([4, 3, 256, 256])
input_instances = paddle.randn([4, instance_nc, 256, 256])
# s_mus, s_logvars, b_mus and b_logvars are returned so the KLD loss can be computed
z, s_mus, s_logvars, b_mus, b_logvars = instance_encode_z(real_image, input_instances)

instance [4, 1, 256, 256]
x1 [4, 64, 128, 128] x2 [4, 128, 64, 64] x3 [4, 256, 32, 32] x4 [4, 512, 16, 16]
y [4, 256, 32, 32] y1 [4, 128, 64, 64] y2 [4, 64, 128, 128] y3 [4, 64, 256, 256]
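
The article does not show the KLDLoss that consumes these means and log-variances. As a hedged sketch, assuming it is the standard VAE KL term between N(mu, exp(logvar)) and a standard normal (the function `kld_loss` below is my own illustration, not the project's code), it would look like this in NumPy:

```python
import numpy as np

def kld_loss(mu, logvar):
    """KL(N(mu, exp(logvar)) || N(0, 1)), summed over all entries."""
    return -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar))

# Same shape as the encoder outputs above: [batch, instance_nc, noise_nc].
mu = np.zeros((4, 2, 108))
logvar = np.zeros((4, 2, 108))

loss_at_prior = kld_loss(mu, logvar)        # posterior equals the prior
loss_shifted = kld_loss(mu + 1.0, logvar)   # every mean shifted by 1
```

The loss is zero when the encoder outputs exactly match the prior and grows by 0.5 per entry when each mean is shifted by 1, which is what pulls the image-dependent noise toward the standard normal used at test time.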


# KLD_loss = (KLDLoss(s_mus, s_logvars) + KLDLoss(b_mus, b_logvars)) * opt.lambda_kld / 2
instance_nc = 2
noise_nc = 108
noise = paddle.randn([x.shape[0], instance_nc, 2, noise_nc])


def pre_process_noise(noise, z):
    '''
    noise: [n, inst_nc, 2, noise_nc]; each z_i: [n, inst_nc, noise_nc]
    z: [s_mus, exp(0.5 * s_logvars), b_mus, exp(0.5 * b_logvars)]
    '''
    # reparameterization: noise * std + mean, separately for scale and bias
    s_noise = paddle.unsqueeze(noise[:, :, 0, :].multiply(z[1]) + z[0], 2)
    b_noise = paddle.unsqueeze(noise[:, :, 1, :].multiply(z[3]) + z[2], 2)
    return paddle.concat([s_noise, b_noise], 2)


# only now is this the noise that runs through the decoder as an input to each INADE layer
noise = pre_process_noise(noise, z)
print(noise.shape)  # [4, instance_nc, 2, noise_nc]

[4, 2, 2, 108]

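So at training time this noise is available. The decoder, however, starts from a very small feature map and upsamples it step by step, and the authors obtain that initial feature map simply with randn, a linear layer and a reshape, so it starts out carrying no information at all. I think this is a good approach: at test time the decoder input is a standard normal sample anyway, so this avoids a mismatch between training and test inputs and keeps the model from leaning on structural information in the initial feature map. (I am still experimenting with this way of handling the feature map, and it seems somewhat hard to train.)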

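import paddle
import paddle.nn as nn

batch_size = 4
z_dim = 256    # placeholder value, not given in the article; take it from your config
sh, sw = 8, 8  # placeholder size of the initial feature map, also an assumption
z = paddle.randn([batch_size, z_dim], dtype='float32')
x = nn.Linear(z_dim, 16 * 64 * sw * sh)(z)
x = x.reshape([-1, 16 * 64, sh, sw])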


2025-07-31
