Alejandro Riera Mainar / learning-autograd / Commits

Commit 9ad8e532 authored Nov 19, 2018 by Alejandro Riera

Neural Network Model with 1 Hidden Layer using autograd

parent 91e2b02c

Showing 2 changed files with 168 additions and 0 deletions:

* README.md (+69, -0)
* nn1hl.py (+99, -0)

README.md 0 → 100644
# Notes on how I proceeded:

* writing unit tests to check that my logic yields the same results as Coursera's implementation
* when implementing 1 FC layer I had a big headache
* For fast iteration I fixed the following hyperparameters: hidden_units=10, epochs=2000, learning_rate=0.01
* Best accuracies in test would range from 66% to 72%
* my first mistake was initializing the weights and biases to zeros
  * this showed up as the loss plateauing very early, and hence a very poor accuracy in both the training and the test set
  * this works on a logistic regression but
  * doesn't work on a NN because of what is called the symmetry problem. Essentially all your hidden units will end up calculating the same function (being symmetric)
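A minimal sketch of why the plateau shows up (same forward pass and loss as nn1hl.py below; the tiny X/Y here are made-up toy values, not the Coursera data): with everything at zero the hidden activations are all identical, the gradients flowing back to w1, b1 and w2 are exactly zero, so only b2 ever moves and the loss flatlines.

```
import torch

# Toy data, made up just to run one forward/backward step: 4 features, 8 examples
X = torch.rand((4, 8), dtype=torch.float64)
Y = torch.randint(0, 2, (1, 8)).type(torch.float64)

hidden_units = 3
w1 = torch.zeros((4, hidden_units), requires_grad=True, dtype=torch.float64)
b1 = torch.zeros((hidden_units, 1), requires_grad=True, dtype=torch.float64)
w2 = torch.zeros((hidden_units, 1), requires_grad=True, dtype=torch.float64)
b2 = torch.zeros((1, 1), requires_grad=True, dtype=torch.float64)

# Same forward pass as nn1hl.py
z1 = w1.transpose(0, 1).mm(X).add(b1)   # all zeros
A1 = z1.clamp(min=0)                    # ReLU: still all zeros, every hidden unit identical
z2 = w2.transpose(0, 1).mm(A1).add(b2)  # all zeros
A2 = torch.sigmoid(z2)                  # constant 0.5

loss = (Y.mul(A2.log()) + (1 - Y).mul((1 - A2).log())).sum() / (-X.shape[1])
loss.backward()

print(w1.grad.abs().sum().item(), w2.grad.abs().sum().item())  # 0.0 0.0 -> the hidden layer gets no learning signal
print(b2.grad)                                                 # non-zero: only the output bias ever updates
```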
* my attempt to fix it was to initialize the weights and biases to random values with `torch.rand`
  * this didn't work either
  * z1, A1, and z2 had values >> 1000
  * A2 = sigmoid(z2) was all ones
  * I was using the Log Loss (Cross Entropy Loss) function and it was yielding NaN
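Where the NaN comes from (toy numbers, same loss formula as in nn1hl.py): once z2 is large, sigmoid saturates to exactly 1.0 in float64, and the `(1 - Y) * log(1 - A2)` term becomes `0 * (-inf) = nan`.

```
import torch

z2 = torch.tensor([[10.0, 2000.0]], dtype=torch.float64)  # second value is absurdly large, like the >> 1000 case above
Y = torch.tensor([[1.0, 1.0]], dtype=torch.float64)

A2 = torch.sigmoid(z2)  # tensor([[0.99995..., 1.0000]]) -- the second entry saturates to exactly 1.0
loss_terms = Y.mul(A2.log()) + (1 - Y).mul((1 - A2).log())
print(loss_terms)       # tensor([[-4.54e-05, nan]]): log(1 - 1.0) = -inf, and 0 * -inf = nan
```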
* hypothesis: are the weights too big?
  * initialize randomly but divide by a power of 10, e.g., `torch.rand() * 0.01`
  * I tried different values: 0.1, 0.01, 0.001 and 0.0001
  * Smaller than 0.1 would return a sensible value for the loss
  * But, for some reason, the gradients (`self.w1.grad`) returned None
  * I can't come up with an explanation for this, so I'll move forward and try something else (one plausible cause is sketched below)
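One plausible cause, with the caveat that the exact `torch.rand` version isn't in this commit so this is an assumption: if the parameter was created as `torch.rand(..., requires_grad=True) * 0.01`, the multiplication produces a new, non-leaf tensor, and autograd only populates `.grad` on leaf tensors, so `self.w1.grad` would stay None after `backward()`.

```
import torch

# Leaf tensor: created directly with requires_grad=True, so .grad gets populated
w_leaf = torch.rand((3, 2), requires_grad=True, dtype=torch.float64)

# Non-leaf tensor: the * 0.01 creates a new node in the graph; .grad is NOT populated on it
w_scaled = torch.rand((3, 2), requires_grad=True, dtype=torch.float64) * 0.01

(w_leaf.sum() + w_scaled.sum()).backward()
print(w_leaf.grad)    # tensor of ones
print(w_scaled.grad)  # None -- it is not a leaf of the autograd graph

# One way to keep the small scale and still get gradients:
w_ok = (torch.rand((3, 2), dtype=torch.float64) * 0.01).requires_grad_()
```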
* instead of using my own initialization, I would use one of the available ones:

```
self.w1 = torch.zeros((X.shape[0], self.hidden_units), requires_grad=True, dtype=torch.float64)
torch.nn.init.normal_(self.w1, mean=0, std=0.01)
```

  * Default values (mean=0, std=1) had the NaN problem
  * Playing with values I found that mean=0, std=0.01 fixed it
  * mean=0, std=0.1 still gave NaN
  * mean=0, std=0.01 yielded the best results
* other initializations: Xavier Normal and Xavier Uniform

```
torch.nn.init.xavier_normal_(self.w1)
torch.nn.init.xavier_normal_(self.w1, gain=torch.nn.init.calculate_gain('relu'))
torch.nn.init.xavier_uniform_(self.w1)
torch.nn.init.xavier_uniform_(self.w1, gain=torch.nn.init.calculate_gain('relu'))
```

  * both Normal and Uniform showed the same behaviour
  * worked out of the box with default params
  * if gain is not calculated: 66% accuracy on test
  * if gain is calculated: 72% accuracy on test
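For reference, the gains these calls return: `calculate_gain('relu')` is sqrt(2) ≈ 1.414 while `calculate_gain('sigmoid')` is 1.0, so the sigmoid-side init is unchanged by the gain and the accuracy difference presumably comes from the extra sqrt(2) scaling on the ReLU layer's weights.

```
import torch

print(torch.nn.init.calculate_gain('relu'))     # 1.414... (sqrt(2))
print(torch.nn.init.calculate_gain('sigmoid'))  # 1.0
```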
* Best Scores achieved:
  * Heavy loading: train accuracy 100.0, test accuracy 76.0
    * xavier_normal_ with gain
    * hidden_units = 1000
    * epochs=10000
    * learning_rate=0.01
    * ~40 minutes training
  * Lightweight: very fast to train (~30 seconds) but fluctuates in performance depending (I guess) on the initial randomization; results go from 70% accuracy on test up to 80%, many times being 72% or 76%
    * xavier_normal_ with gain
    * hidden_units = 10
    * epochs=1000
    * learning_rate=0.01
    * ~30 seconds training
# Ideas for next steps
* Implement my own ReLU and Sigmoid functions with Autograd, following https://github.com/jcjohnson/pytorch-examples#pytorch-defining-new-autograd-functions (a starting sketch below)
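A starting point for that idea, following the custom `torch.autograd.Function` pattern from the linked pytorch-examples page (a sketch, not yet wired into this repo's code):

```
import torch


class MyReLU(torch.autograd.Function):
    """ReLU as a custom autograd Function: forward clamps, backward masks the gradient."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        x, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[x < 0] = 0   # no gradient where the input was negative
        return grad_input


# Usage inside the training loop would replace z1.clamp(min=0):
# A1 = MyReLU.apply(z1)
```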
* Generalising nn1hl to accept L hidden layers
* Extracting the logic of the optimizer away from the model. Get inspired by torch.Optimizer: https://pytorch.org/docs/stable/optim.html (a starting sketch below)
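For the optimizer idea, the shortest path is probably to hand the four parameter tensors to `torch.optim.SGD` instead of updating them by hand. A sketch with toy stand-ins for self.w1 / self.b1 / self.w2 / self.b2 (shapes and dtype assumed from nn1hl.py):

```
import torch

# Toy parameters standing in for self.w1 / self.b1 / self.w2 / self.b2
w1 = torch.zeros((4, 10), requires_grad=True, dtype=torch.float64)
b1 = torch.zeros((10, 1), requires_grad=True, dtype=torch.float64)
w2 = torch.zeros((10, 1), requires_grad=True, dtype=torch.float64)
b2 = torch.zeros((1, 1), requires_grad=True, dtype=torch.float64)

optimizer = torch.optim.SGD([w1, b1, w2, b2], lr=0.01)

# Inside the epoch loop this would replace the manual `-= learning_rate * grad`
# updates and the four `grad.zero_()` calls:
X = torch.rand((4, 8), dtype=torch.float64)  # toy batch
loss = torch.sigmoid(w2.t().mm(w1.t().mm(X).add(b1).clamp(min=0)).add(b2)).mean()  # placeholder forward pass / loss
optimizer.zero_grad()
loss.backward()
optimizer.step()
```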
nn1hl.py 0 → 100644
"""
My attempt at reproducing Coursera's logistic regresion example with autograd
"""
import
numpy
as
np
import
torch
class
NN1HiddenLayerModel
():
def
__init__
(
self
):
self
.
w1
=
None
self
.
b1
=
None
self
.
w2
=
None
self
.
b2
=
None
self
.
hidden_units
=
10
def
train
(
self
,
X
,
Y
,
epochs
=
1000
,
learning_rate
=
0.5
):
self
.
w1
=
torch
.
zeros
((
X
.
shape
[
0
],
self
.
hidden_units
),
requires_grad
=
True
,
dtype
=
torch
.
float64
)
self
.
b1
=
torch
.
zeros
((
self
.
hidden_units
,
1
)
,
requires_grad
=
True
,
dtype
=
torch
.
double
)
self
.
w2
=
torch
.
zeros
((
self
.
hidden_units
,
1
)
,
requires_grad
=
True
,
dtype
=
torch
.
float64
)
self
.
b2
=
torch
.
zeros
((
1
,
1
)
,
requires_grad
=
True
,
dtype
=
torch
.
double
)
# torch.nn.init.normal_(self.w1, mean=0, std=0.01)
# torch.nn.init.normal_(self.b1, mean=0, std=0.01)
# torch.nn.init.normal_(self.w2, mean=0, std=0.01)
# torch.nn.init.normal_(self.b2, mean=0, std=0.01)
torch
.
nn
.
init
.
xavier_normal_
(
self
.
w1
,
gain
=
torch
.
nn
.
init
.
calculate_gain
(
'relu'
))
torch
.
nn
.
init
.
xavier_normal_
(
self
.
b1
,
gain
=
torch
.
nn
.
init
.
calculate_gain
(
'relu'
))
torch
.
nn
.
init
.
xavier_normal_
(
self
.
w2
,
gain
=
torch
.
nn
.
init
.
calculate_gain
(
'sigmoid'
))
torch
.
nn
.
init
.
xavier_normal_
(
self
.
b2
,
gain
=
torch
.
nn
.
init
.
calculate_gain
(
'sigmoid'
))
# torch.nn.init.xavier_uniform_(self.w1, gain=torch.nn.init.calculate_gain('relu'))
# torch.nn.init.xavier_uniform_(self.b1, gain=torch.nn.init.calculate_gain('relu'))
# torch.nn.init.xavier_uniform_(self.w2, gain=torch.nn.init.calculate_gain('sigmoid'))
# torch.nn.init.xavier_uniform_(self.b2, gain=torch.nn.init.calculate_gain('sigmoid'))
m
=
X
.
shape
[
1
]
for
i
in
range
(
epochs
):
z1
=
self
.
w1
.
transpose
(
0
,
1
).
mm
(
X
).
add
(
self
.
b1
)
A1
=
z1
.
clamp
(
min
=
0
)
# ReLu
z2
=
self
.
w2
.
transpose
(
0
,
1
).
mm
(
A1
).
add
(
self
.
b2
)
A2
=
torch
.
sigmoid
(
z2
)
loss
=
Y
.
mul
(
A2
.
log
())
+
(
1
-
Y
).
mul
((
1
-
A2
).
log
())
loss
=
loss
.
sum
()
/
(
-
m
)
loss
.
backward
()
# import pdb; pdb.set_trace()
with
torch
.
no_grad
():
self
.
w1
-=
learning_rate
*
self
.
w1
.
grad
self
.
b1
-=
learning_rate
*
self
.
b1
.
grad
self
.
w2
-=
learning_rate
*
self
.
w2
.
grad
self
.
b2
-=
learning_rate
*
self
.
b2
.
grad
# Manually zero the gradients after running the backward pass
self
.
w1
.
grad
.
zero_
()
self
.
b1
.
grad
.
zero_
()
self
.
w2
.
grad
.
zero_
()
self
.
b2
.
grad
.
zero_
()
if
i
%
100
==
0
:
# import pdb; pdb.set_trace()
print
(
"Loss after iteration %i: %f"
%
(
i
,
loss
))
def
predict
(
self
,
X
):
with
torch
.
no_grad
():
z1
=
self
.
w1
.
transpose
(
0
,
1
).
mm
(
X
).
add
(
self
.
b1
)
A1
=
z1
.
clamp
(
min
=
0
)
# ReLu
z2
=
self
.
w2
.
transpose
(
0
,
1
).
mm
(
A1
).
add
(
self
.
b2
)
Y_pred
=
torch
.
sigmoid
(
z2
)
Y_pred
[
Y_pred
<
0.5
]
=
0
Y_pred
[
Y_pred
>=
0.5
]
=
1
return
Y_pred
def
benchmark
(
self
,
X
,
Y
):
Y_pred
=
self
.
predict
(
X
)
accuracy
=
np
.
mean
(
np
.
abs
(
Y_pred
.
numpy
()
-
Y
.
numpy
()))
accuracy
=
100
-
100
*
accuracy
return
accuracy
if
__name__
==
"__main__"
:
from
coursera01w02.lr_utils
import
load_dataset
train_set_x_orig
,
train_set_y
,
test_set_x_orig
,
test_set_y
,
classes
=
load_dataset
()
train_set_x_flatten
=
train_set_x_orig
.
reshape
(
train_set_x_orig
.
shape
[
0
],
-
1
).
T
test_set_x_flatten
=
test_set_x_orig
.
reshape
(
test_set_x_orig
.
shape
[
0
],
-
1
).
T
train_set_x
=
train_set_x_flatten
/
255.
test_set_x
=
test_set_x_flatten
/
255.
train_set_x
=
torch
.
from_numpy
(
train_set_x
).
type
(
torch
.
float64
)
train_set_y
=
torch
.
from_numpy
(
train_set_y
).
type
(
torch
.
float64
)
test_set_x
=
torch
.
from_numpy
(
test_set_x
).
type
(
torch
.
float64
)
test_set_y
=
torch
.
from_numpy
(
test_set_y
).
type
(
torch
.
float64
)
# d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 2000, learning_rate = 0.005, print_cost = True)
nn1hl
=
NN1HiddenLayerModel
()
nn1hl
.
train
(
train_set_x
,
train_set_y
,
epochs
=
1000
,
learning_rate
=
0.01
)
train_accuracy
=
nn1hl
.
benchmark
(
train_set_x
,
train_set_y
)
test_accuracy
=
nn1hl
.
benchmark
(
test_set_x
,
test_set_y
)
print
(
f
"train accuracy:
{
train_accuracy
}
"
)
print
(
f
"test accuracy:
{
test_accuracy
}
"
)