Torch - A First Look at Neural Networks

Published: 2020-12-14 21:52:35 | Category: Big Data | Source: compiled from the web

Torch7 Neural Networks: A First Look

Original article: Torch7. Hello World, Neural Networks!

Torch7 learning path:

Learning Lua - basic syntax, tables, scope, functions

Learning Torch - the th interpreter, the torch package, the torch Tensor data structure, the optim package

Advanced Torch - the nn package, the dp package, the nngraph package

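1. Introduction to Lua

Data Structures

The two data types used most in this tutorial are numbers (double-precision floats, "doubles") and tables.

A table can hold values of mixed types and serves as both a dictionary (dict) and a list.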

-- doubles
var = 10 -- global by default
local var2 = 10 -- local variable

-- tables
dict = {a=1,b=2,c=3} -- dict
list = {1,2,3} -- list

-- printing
print(dict.a)
print(list[1]) -- list indices start at 1
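
To make the list/dict distinction concrete, here is a small sketch in plain Lua (no Torch needed). The variable names and values are made up for illustration; it shows `ipairs` for ordered list traversal, `pairs` for dict traversal, and `#` for the list length.

```lua
-- A list-style table: integer keys 1..n, visited in order by ipairs.
local list = {10, 20, 30}
local total = 0
for _, value in ipairs(list) do
  total = total + value
end
print(#list, total) -- length of the list and sum of its elements

-- A dict-style table: string keys, visited by pairs (order is not guaranteed).
local dict = {a = 1, b = 2, c = 3}
local keyCount = 0
for _ in pairs(dict) do
  keyCount = keyCount + 1
end
print(keyCount)

-- Tables are mutable and can mix value types.
list[#list + 1] = "forty" -- append a string to a list of numbers
dict.d = {1, 2}           -- store a nested table under a new key
print(list[4], #dict.d)
```

Note that `#` only counts the list part of a table, so it is not a reliable way to count the entries of a dict-style table.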

Control Flow Statements

for i=1,10 do
  if i == 1 then
    print("one")
  elseif i == 2 then
    print("two")
  else
    print("something else")
  end
end

val = 1
while val < 10 do
  val = val * 2
end

Functions

function add_23(n)
  return n + 23
end

print(add_23(7)) -- prints 30

Functions can also be defined on tables:

tab = {1,3,4}
function tab:sum() -- the colon adds an implicit first parameter, self
  local c = 0
  for i=1,#self do
    c = c + self[i]
  end
  return c
end

print(tab:sum()) -- displays 8 (the colon call passes tab as self)
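
The colon is only syntactic sugar, which the following plain-Lua sketch tries to show. The `account` table and its `deposit` method are hypothetical names chosen for the example.

```lua
local account = {balance = 100}

-- Defined with a colon: Lua adds a hidden first parameter named self.
function account:deposit(amount)
  self.balance = self.balance + amount
  return self.balance
end

-- These two calls do the same thing.
local afterColon = account:deposit(10)       -- self is passed implicitly
local afterDot = account.deposit(account, 5) -- self is passed explicitly
print(afterColon, afterDot)
```

The same convention is what makes calls such as `mlp:add(...)` or `trainer:train(...)` work later in this tutorial.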

Input/Output

File I/O:

file = assert(io.open("test.txt","w"))
-- io.lines does not expand "~", so build the path from $HOME
for line in io.lines(os.getenv("HOME") .. "/input.txt") do
  file:write(line .. "\n") -- write on file
end
file:close()

I/O on stdin and stdout:

input_val = io.read()
io.write("You said: " .. input_val .. "\n")
-- alternatively: print("You said: " .. input_val)
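
As a self-contained check of the file I/O calls, the sketch below writes two lines to a temporary file and reads them back. It assumes a platform where `os.tmpname()` returns a writable path; the file contents are made up.

```lua
-- Write two lines to a temporary file.
local path = os.tmpname()
local out = assert(io.open(path, "w"))
out:write("first\n", "second\n") -- strings are joined with "..", newline is "\n"
out:close()

-- Read the file back line by line; io.lines strips the trailing newline.
local lines = {}
for line in io.lines(path) do
  lines[#lines + 1] = line
end
os.remove(path) -- clean up the temporary file

print(#lines, lines[1], lines[2])
```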

2. Torch

Installing Torch

git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch
./install.sh
source ~/.bashrc # source ~/.profile for Macs

Testing Torch

Type the following at the shell prompt:
```
th
```
This starts the Torch interpreter; press Ctrl+C to exit.

To run a Lua/Torch script:
```
th test_script.lua
```
To use the torch package:
```
require 'torch'
```

torch.Tensor

Torch stores its data in the torch.Tensor data structure.

A tensor generalizes scalars and vectors to any number of dimensions.

All data in Torch takes the form of a torch.Tensor:

a = torch.Tensor(1) -- 1-element tensor, used here as a scalar
b = torch.Tensor(2,1) -- column vector (2 rows,1 column)
c = torch.Tensor(2,2) -- 2D tensor, i.e. a matrix (2 rows,2 columns)
-- a,b,c are uninitialized here, so their contents are arbitrary values

a[1] = 10 -- now 'a' is a tensor containing an actual value,i.e. 10.
d = torch.rand(2,2) -- a 2D tensor (2x2) with random values.
e = d:t() -- 'e' is the transpose of 'd'
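
Since uninitialized tensors are easy to misuse, here is a minimal sketch of building tensors with known contents and applying a few common operations. It assumes a working Torch7 installation; the values are made up.

```lua
require 'torch'

-- Build tensors with known contents instead of uninitialized memory.
local v = torch.Tensor({1, 2, 3})        -- 1D tensor from a Lua table
local m = torch.Tensor({{1, 2}, {3, 4}}) -- 2x2 matrix from nested tables
local z = torch.zeros(2, 2)              -- 2x2 matrix filled with zeros

print(v:size(1))        -- number of elements along dimension 1
print(m:t())            -- transpose of m
print(m * m)            -- matrix product
print(torch.cmul(m, m)) -- element-wise product
print(m:sum(), z:sum()) -- sum of all elements
```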

3. Torch - nn package

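nn is the neural network package. Training and testing a network with nn involves the following parts:

Model - defines the kind of network (feedforward, convolutional, recurrent, or recursive), the number of layers, the number of units in each layer, the activation function of each layer, and whether to use dropout to avoid overfitting;
Training - has two parts: the optimization algorithm (e.g., SGD) and the loss function to optimize (called a Criterion in Torch);
Data - the data fed to the network; the most important part;
Prediction and Evaluation - use the trained model for prediction and measure its performance on a test dataset.

Because tuning network parameters is an important part of the work, command-line options are used here: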

cmd = torch.CmdLine()
cmd:text()
cmd:text('Options for my NN')
cmd:option('-units',10,'units in the hidden layer')
cmd:option('-learningRate',0.1,'learning rate')
-- etc...
cmd:text()
opt = cmd:parse(arg)

3.1 Model

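The model is built from containers, which determine how data is fed into the network and how the network computes its output.

For example, each column of a 2D torch Tensor can be fed to a separate module (the nn.Parallel container); the same data can be fed to several layers whose outputs are then concatenated (the nn.Concat container); or, in the simplest case, layers can be chained into a fully connected feed-forward network (the nn.Sequential container).

Before defining the network, set up the following: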

require 'nn'
mlp = nn.Sequential()

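Assume the network has 2 hidden feedforward layers. Each feedforward layer is computed with nn.Linear, which applies a linear transformation to its input.

Neural networks are good at learning non-linear representations of their input, so non-linearity layers such as nn.Tanh, nn.Sigmoid, or nn.ReLU are added to the network. Here tanh is used as the transfer function.

Assume a 10-dimensional input and 10 units in each hidden layer: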

inputSize = 10
hiddenLayer1Size = opt.units
hiddenLayer2Size = opt.units

mlp:add(nn.Linear(inputSize,hiddenLayer1Size))
mlp:add(nn.Tanh())
mlp:add(nn.Linear(hiddenLayer1Size,hiddenLayer2Size))
mlp:add(nn.Tanh())

Next, add the output layer.

Assume a two-class classification task, with Softmax as the transfer function (in fact, the log of Softmax):

nclasses = 2

mlp:add(nn.Linear(hiddenLayer2Size,nclasses))
mlp:add(nn.LogSoftMax())

Print the model:

print(mlp)

The result is:

nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> output]
  (1): nn.Linear(10 -> 10)
  (2): nn.Tanh
  (3): nn.Linear(10 -> 10)
  (4): nn.Tanh
  (5): nn.Linear(10 -> 2)
  (6): nn.LogSoftMax
}

[Figure: graphical representation of the network]

Once the NN is defined, use the Module:forward method to check that the forward pass works:

out = mlp:forward(torch.randn(1,10))
print(out)

The output is a 1×2 tensor, where out[1][i] is the log-probability that the input belongs to class i.
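
To see what "log of Softmax" means numerically, the following plain-Lua sketch computes log-softmax over an ordinary Lua list. It only illustrates the formula and is not the nn.LogSoftMax implementation; the function name and the scores are made up.

```lua
-- Log-softmax of a plain Lua list of scores:
--   logsoftmax(x)[i] = x[i] - log(sum_j exp(x[j]))
local function logSoftMax(scores)
  -- Subtract the maximum first for numerical stability.
  local maxScore = scores[1]
  for i = 2, #scores do
    if scores[i] > maxScore then maxScore = scores[i] end
  end
  local sumExp = 0
  for i = 1, #scores do
    sumExp = sumExp + math.exp(scores[i] - maxScore)
  end
  local logSum = maxScore + math.log(sumExp)
  local result = {}
  for i = 1, #scores do
    result[i] = scores[i] - logSum
  end
  return result
end

local logProbs = logSoftMax({2.0, 0.5}) -- made-up scores for 2 classes
local probSum = 0
for i = 1, #logProbs do
  probSum = probSum + math.exp(logProbs[i]) -- exp turns log-probabilities into probabilities
end
print(logProbs[1], logProbs[2], probSum)
```

Every log-probability is negative or zero, and exponentiating them gives probabilities that sum to 1. The class with the largest log-probability is the predicted class.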

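3.2 Torch - Training

The training algorithm can be written by hand: preprocess the input data, feed it to the network, compute the gradients, and update the network parameters.

Optimization algorithm:

SGD is used here, as implemented by nn.StochasticGradient;

it needs a few parameters to be set, such as the learning rate (learningRate).

Loss function (Criterion):

The nn package has many loss functions; the negative log-likelihood criterion is used here.

The training parameters are set through the command-line options described earlier.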

criterion = nn.ClassNLLCriterion() 
trainer = nn.StochasticGradient(mlp,criterion)
trainer.learningRate = opt.learningRate
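
For readers who want to see the hand-written variant mentioned at the start of this section, here is a minimal sketch of one possible manual loop. It assumes a working Torch7 installation with nn. The tiny model, the single example, the class label, and the step count are all made up so that the block runs on its own; it is not what nn.StochasticGradient does internally line for line.

```lua
require 'nn'

-- Hypothetical tiny setup so the sketch is self-contained.
local model = nn.Sequential()
model:add(nn.Linear(4, 2))
model:add(nn.LogSoftMax())
local criterion = nn.ClassNLLCriterion()

local input = torch.Tensor({0.5, -1, 2, 0.1}) -- one made-up example with 4 features
local target = 1                              -- its class label (classes are 1-based)
local learningRate = 0.1

local firstLoss, lastLoss
for step = 1, 50 do
  local output = model:forward(input)                   -- 1. forward pass
  local loss = criterion:forward(output, target)        -- 2. compute the loss
  model:zeroGradParameters()                            -- 3. reset accumulated gradients
  local gradOutput = criterion:backward(output, target) -- 4. gradient of the loss
  model:backward(input, gradOutput)                     -- 5. backpropagate through the model
  model:updateParameters(learningRate)                  -- 6. plain SGD update
  firstLoss = firstLoss or loss
  lastLoss = loss
end
print(firstLoss, lastLoss) -- the loss on this single example should go down
```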

3.3 Torch - Data

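Before training can start, the network needs data.

The dataset must be a Lua table that provides a :size() method returning the number of elements in the table. Each element of the table must be a subtable with two entries:

the input (a 1×input_size Tensor);
the target class (a 1×1 Tensor).

Assume the dataset is stored in a CSV file (or a similar format) in which each line also contains the target class.

Because Lua has no built-in function for reading CSV files, a custom reader is defined as follows: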

function string:splitAtCommas()
  local sep,values = ",",{}
  local pattern = string.format("([^%s]+)",sep)
  self:gsub(pattern,function(c) values[#values+1] = c end)
  return values
end

function loadData(dataFile)
  local dataset = {}
  local i = 1 -- index of the next example
  for line in io.lines(dataFile) do
    local values = line:splitAtCommas()
    local y = torch.Tensor(1)
    y[1] = tonumber(values[#values]) -- the target class is the last number in the line
    values[#values] = nil
    local x = torch.Tensor(values) -- the input data is all the other numbers
    dataset[i] = {x,y}
    i = i + 1
  end
  function dataset:size() return (i - 1) end -- the :size() method the trainer requires
  return dataset
end

dataset = loadData("trainfile.csv")
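
The parsing step can be tried without Torch. The sketch below uses a standalone function with the same splitting idea on a made-up CSV row, then separates the features from the class label; the sample row and variable names are assumptions for illustration.

```lua
-- Same splitting idea as splitAtCommas, written as a standalone function.
local function splitAtCommas(line)
  local values = {}
  for field in line:gmatch("([^,]+)") do
    values[#values + 1] = field
  end
  return values
end

local fields = splitAtCommas("0.5,1.25,-3,2") -- made-up row: 3 features, then the class
local label = tonumber(table.remove(fields))  -- remove and convert the last field
local features = {}
for i, field in ipairs(fields) do
  features[i] = tonumber(field) -- fields are strings until converted
end
print(#features, label)
```

Note that the pattern `[^,]+` skips empty fields, so a row such as `1,,2` would yield two values rather than three.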

Once the data is loaded, call nn.StochasticGradient:train() to start training the network:

trainer:train(dataset)

3.4 Torch - Prediction and Evaluation

After training, the network can classify test data:

x = torch.randn(10)
y = mlp:forward(x)
print(y) -- returns the log probability of each class

On a test dataset, the accuracy of the trained model can be computed to evaluate it:

function argmax(v) -- Lua has no built-in argmax, so define one
  local maxvalue = torch.max(v)
  for i=1,v:size(1) do
    if v[i] == maxvalue then
      return i
    end
  end
end


tot = 0
pos = 0
for line in io.lines("testfile.csv") do
  local values = line:splitAtCommas()
  local y = torch.Tensor(1)
  y[1] = tonumber(values[#values])
  values[#values] = nil
  local x = torch.Tensor(values)
  local prediction = argmax(mlp:forward(x))
  if math.floor(prediction) == math.floor(y[1]) then
    pos = pos + 1
  end
  tot = tot + 1
end
print("Accuracy(%) is " .. pos/tot*100)

The model can also be saved to disk and loaded again when needed:

print("Weights of the model before saving:")
print(mlp:get(1).weight)
-- mlp:get(1) is the first module of mlp,i.e. nn.Linear(10 -> 10)
-- mlp:get(1).weight is the weight matrix of that layer
torch.save('file.th',mlp)
mlp2 = torch.load('file.th')
print("Weights of the loaded model:")
print(mlp2:get(1).weight) -- this will print the exact same matrix

4. dp package

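dp is a deep learning library that extends nn with many useful features.

It includes common datasets for computer vision and NLP, and provides a more elegant way to create and load datasets.

dp also makes common training techniques easier to use, such as learning rate decay and early stopping on a development set, as well as reporting accuracies and confusion matrices on the training, development, and test sets. See the dp example demo for reference.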

4.1 Word embeddings

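In NLP, word embeddings provide distributed representations of words that make syntactic and semantic information easier to analyze.

Tools such as word2vec and GloVe can be trained on large language-modeling corpora to produce word embeddings, which a neural network can then use directly to represent words.

For example, if the input is a sequence of words, each word can be replaced by its pretrained embedding vector and all the vectors concatenated to form a numeric input layer. This requires a script that produces the vectors for a given sequence of words; a Python script can be used for that.

Another approach is to train the embeddings as part of the NN. In this case:

The input layer consists of a sequence of indices, where each index identifies a word's position in a predefined vocabulary.
The second layer is an embedding layer, or lookup table, which retrieves the embedding vector for each index. The lookup table is a weight matrix trained like the weight matrix of any other NN layer; its i-th row holds the embedding of the i-th word in the vocabulary.

If the lookup table's weight matrix is initialized randomly, the initial embeddings are meaningless. As the network trains, backpropagation updates this matrix, and it eventually yields embeddings that carry syntactic and semantic information.

In Torch: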

require 'nn';
require 'dp'; --necessary for nn.Collapse

vocabularySize = 10000
embeddingSize = 100 -- a common choice for word embeddings

model = nn.Sequential()
model:add(nn.LookupTable(vocabularySize,embeddingSize))
model:add(nn.Collapse(2)) -- to concatenate the embeddings of all input words
-- then you can add the rest of your network..
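
To make the lookup-then-concatenate step concrete, here is a plain-Lua sketch that mimics it with ordinary tables. It only illustrates the idea behind a lookup table followed by concatenation and does not call nn.LookupTable or nn.Collapse; the 4-word vocabulary and the embedding values are made up.

```lua
-- Hypothetical 4-word vocabulary with 3-dimensional embeddings.
-- Row i holds the embedding of word i.
local embeddings = {
  {0.1, 0.2, 0.3},
  {0.4, 0.5, 0.6},
  {0.7, 0.8, 0.9},
  {1.0, 1.1, 1.2},
}

-- Look up each word index and concatenate the rows into one flat vector.
local function embedAndConcat(wordIndices)
  local flat = {}
  for _, wordIndex in ipairs(wordIndices) do
    for _, value in ipairs(embeddings[wordIndex]) do
      flat[#flat + 1] = value
    end
  end
  return flat
end

local inputLayer = embedAndConcat({3, 1}) -- a 2-word sequence: word 3, then word 1
print(#inputLayer)                        -- 2 words * 3 dimensions
print(table.concat(inputLayer, " "))
```

In the trained version, the rows of `embeddings` would be parameters updated by backpropagation rather than fixed numbers.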

Initializing the embedding layer with pretrained word2vec or GloVe embeddings is an effective alternative, and it can noticeably reduce the number of training epochs.

In Torch:

model = nn.Sequential()
emb = nn.LookupTable(vocabularySize,embeddingSize)
i = 1
for line in io.lines("pretrained.txt") do
  vals = line:splitAtCommas()
  emb.weight[i] = torch.Tensor(vals) -- set the pretrained values in the matrix
  i = i + 1
end
model:add(emb)
model:add(nn.Collapse(2))

5. nngraph package

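nngraph extends nn to make it easy to define any neural network architecture that forms a DAG (Directed Acyclic Graph). It is typically used for networks with several kinds of inputs or outputs, such as LSTMs. nn provides modules for this as well, but the graph style is more convenient and flexible than the sequential style.

The feed-forward NN defined above can be rewritten with nngraph.

First, define the inputs (one or more):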

require 'nngraph'

inputs = {}
table.insert(inputs,nn.Identity()())
input = inputs[1]

Then, build the network:

lin1 = nn.Linear(inputSize,hiddenLayer1Size)(input)
act1 = nn.Tanh()(lin1)
lin2 = nn.Linear(hiddenLayer1Size,hiddenLayer2Size)(act1)
act2 = nn.Tanh()(lin2)

out = nn.Linear(hiddenLayer2Size,nclasses)(act2)
softmax = nn.LogSoftMax()(out)

Finally, define the outputs (one or more) and build an nn.gModule object, which can be used like a regular nn container:
```
outputs = {}
table.insert(outputs,softmax)

mlp = nn.gModule(inputs,outputs)
```
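
Because the main appeal of nngraph is handling several inputs or outputs, here is a minimal sketch of a network with two inputs. It assumes Torch7 with nn and nngraph installed. The layer sizes and the choice to merge the branches by addition are made up for illustration.

```lua
require 'nn'
require 'nngraph'

-- Two hypothetical inputs with different sizes (5 and 3 features).
local inputA = nn.Identity()()
local inputB = nn.Identity()()

-- Each input gets its own branch, projected to a shared size of 4.
local branchA = nn.Tanh()(nn.Linear(5, 4)(inputA))
local branchB = nn.Tanh()(nn.Linear(3, 4)(inputB))

-- Merge the branches by element-wise addition, then classify into 2 classes.
local merged = nn.CAddTable()({branchA, branchB})
local logProbs = nn.LogSoftMax()(nn.Linear(4, 2)(merged))

local twoInputNet = nn.gModule({inputA, inputB}, {logProbs})

-- A multi-input gModule takes a table of tensors, one per input node.
local result = twoInputNet:forward({torch.randn(5), torch.randn(3)})
print(result) -- log-probabilities for the 2 classes
```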


(Editor: Li Datong)
