
Another way of gradient backpropagation #81

Open · ickma2311 opened this issue Sep 19, 2024 · 2 comments

@ickma2311

Hi @karpathy, I followed your code and created my own implementation, with some modifications that I believe make the code clearer and easier to understand.

In the forward pass (using the add operation as an example):

    out.backwards.extend([(self, 1), (other, 1)])

In the backward pass:

    def backward(self):
        # Each entry is a (child node, local partial derivative) pair recorded
        # during the forward pass; push this node's grad down through each one.
        for node, partial_derivative in self.backwards:
            node.grad += self.grad * partial_derivative
            node.backward()
            # Intermediate nodes are zeroed once their gradient has been
            # propagated; only Parameter grads are kept for the update step.
            if not isinstance(node, Parameter):
                node.zero_grad()
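
To make the idea concrete, here is a minimal, self-contained sketch of the scheme (simplified for this thread; the class names and operators are illustrative rather than the exact code in the repo):

    class Value:
        def __init__(self, data):
            self.data = data
            self.grad = 0.0
            # (child node, local partial derivative) pairs recorded at forward time
            self.backwards = []

        def zero_grad(self):
            self.grad = 0.0

        def __add__(self, other):
            out = Value(self.data + other.data)
            # d(out)/d(self) = 1 and d(out)/d(other) = 1
            out.backwards.extend([(self, 1), (other, 1)])
            return out

        def __mul__(self, other):
            out = Value(self.data * other.data)
            # d(out)/d(self) = other.data and d(out)/d(other) = self.data
            out.backwards.extend([(self, other.data), (other, self.data)])
            return out

        def backward(self):
            for node, partial_derivative in self.backwards:
                node.grad += self.grad * partial_derivative
                node.backward()
                if not isinstance(node, Parameter):
                    node.zero_grad()


    class Parameter(Value):
        """Leaf node (trainable weight) whose gradient is kept after propagation."""
        pass


    # Usage: y = a * b + a, so dy/da = b + 1 and dy/db = a
    a, b = Parameter(2.0), Parameter(3.0)
    y = a * b + a
    y.grad = 1.0   # seed the output gradient
    y.backward()
    print(a.grad, b.grad)  # 4.0 2.0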

I tested it, and it appears to be working well. On the Iris dataset, I achieved an accuracy of 93%. I believe it could reach 100% if I use a more effective loss function.
My code: https://github.com/ickma/picograd

@dkgitcode

In this implementation, you are calculating the gradients eagerly during the forward pass rather than deferring them the way Karpathy does. While that may enhance clarity, wouldn't it add unnecessary computation during inference? You are crossing bridges before you get to them, and sometimes you never need to cross the bridge at all (i.e., a forward-only pass).
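
For comparison, here is a rough sketch of the deferred style (paraphrasing micrograd's approach, trimmed to the add case): the forward pass only records a closure over the parent nodes, and no gradient arithmetic runs unless backward() is actually called.

    class Value:
        def __init__(self, data, _children=()):
            self.data = data
            self.grad = 0.0
            self._backward = lambda: None   # closure filled in by the op that created this node
            self._prev = set(_children)

        def __add__(self, other):
            out = Value(self.data + other.data, (self, other))
            def _backward():
                # executed only if/when backward() is called on some ancestor
                self.grad += out.grad
                other.grad += out.grad
            out._backward = _backward
            return out

        def backward(self):
            # topologically order the graph, then run the stored closures once,
            # from the output back toward the leaves
            topo, visited = [], set()
            def build(v):
                if v not in visited:
                    visited.add(v)
                    for child in v._prev:
                        build(child)
                    topo.append(v)
            build(self)
            self.grad = 1.0
            for v in reversed(topo):
                v._backward()

Inference still records the graph structure, but none of the gradient arithmetic executes unless backward() is called.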

@ickma2311
Author

@dkgitcode Yes, you are correct. My implementation is less efficient.
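
One possible mitigation (a hypothetical tweak, not something currently in picograd): gate the bookkeeping behind a flag so a forward-only pass skips both the list building and the extra arithmetic. Modifying the __add__ from the sketch earlier in this thread:

    GRAD_ENABLED = True   # hypothetical switch; set to False for inference

    def __add__(self, other):
        out = Value(self.data + other.data)
        if GRAD_ENABLED:
            # record partial derivatives only when gradients will be needed
            out.backwards.extend([(self, 1), (other, 1)])
        return out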
