Hi @karpathy, I followed your code and created my own implementation, with a few modifications that I believe make the code clearer and easier to follow.
On the forward pass (using the add operation as an example):
out.backwards.extend([(self, 1), (other, 1)])
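For context, here is a minimal sketch of the class this line might sit in. I am assuming a Value-like node that carries data, grad, and a backwards list of (child, local derivative) pairs; the actual code in the repo may differ in its details:

class Value:
    def __init__(self, data):
        self.data = data
        self.grad = 0.0
        self.backwards = []  # (child_node, partial_derivative) edges

    def __add__(self, other):
        out = Value(self.data + other.data)
        # d(out)/d(self) = d(out)/d(other) = 1 for addition, so the
        # partial derivatives are computed eagerly and stored on the output
        out.backwards.extend([(self, 1), (other, 1)])
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data)
        # for multiplication the partials depend on the operands' values
        out.backwards.extend([(self, other.data), (other, self.data)])
        return out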
On the backward pass:
def backward(self):
    # Push this node's gradient to each child along the stored edge.
    for node, partial_derivative in self.backwards:
        node.grad += self.grad * partial_derivative
        node.backward()
        # Reset intermediate nodes after propagating, so a node reached
        # through several paths only pushes each new increment downstream;
        # Parameter gradients are kept for the optimizer step.
        if not isinstance(node, Parameter):
            node.zero_grad()
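Note that with backward written this way, the root node's gradient has to be seeded to 1 before the first call (unless that is wrapped elsewhere in the code). A minimal usage sketch, where the loss node here is hypothetical:

loss = mse(model_output, target)  # hypothetical root Value of the graph
loss.grad = 1.0                   # seed d(loss)/d(loss) = 1
loss.backward()                   # accumulate gradients into the Parameters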
I tested it, and it appears to be working well. On the Iris dataset, I achieved an accuracy of 93%. I believe it could reach 100% if I use a more effective loss function.
My code: https://github.com/ickma/picograd
In this implementation, you calculate the gradients immediately rather than deferring them as Karpathy does. While that may aid clarity, wouldn't it add unnecessary computation during inference? You are crossing bridges before you even get to them, and there are cases where you never need to cross the bridge at all (i.e., a forward-only pass at inference time).
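For reference, micrograd wraps the partials in a closure that only runs if backward() is ever invoked (paraphrased from memory, so check the actual source):

def __mul__(self, other):
    out = Value(self.data * other.data, (self, other), '*')

    def _backward():
        # the partials are evaluated here, i.e. only at backward time
        self.grad += other.data * out.grad
        other.grad += self.data * out.grad
    out._backward = _backward  # deferred: nothing gradient-related runs yet

    return out

For add and mul the eager partials are cheap anyway; the difference shows up more for ops like tanh, where the eager scheme evaluates the derivative on every forward pass even when no backward follows.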