
Another way of gradient backpropagation #81

Open · ickma2311 opened this issue Sep 19, 2024 · 2 comments

@ickma2311

Hi @karpathy, I followed your code and created my own implementation, with some modifications that I believe make the code clearer and easier to understand.

In the forward pass (using the add operation as an example):

    out.backwards.extend([(self, 1), (other, 1)])

In the backward pass:

    def backward(self):
        # Each entry is a (child node, local partial derivative) pair recorded
        # during the forward pass; push this node's grad down through each one.
        for node, partial_derivative in self.backwards:
            node.grad += self.grad * partial_derivative
            node.backward()
            # Intermediate nodes are zeroed once their gradient has been
            # propagated; only Parameter grads are kept for the update step.
            if not isinstance(node, Parameter):
                node.zero_grad()
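
To make the idea concrete, here is a minimal, self-contained sketch of the scheme (simplified for this thread; the class names and operators are illustrative rather than the exact code in the repo):

    class Value:
        def __init__(self, data):
            self.data = data
            self.grad = 0.0
            # (child node, local partial derivative) pairs recorded at forward time
            self.backwards = []

        def zero_grad(self):
            self.grad = 0.0

        def __add__(self, other):
            out = Value(self.data + other.data)
            # d(out)/d(self) = 1 and d(out)/d(other) = 1
            out.backwards.extend([(self, 1), (other, 1)])
            return out

        def __mul__(self, other):
            out = Value(self.data * other.data)
            # d(out)/d(self) = other.data and d(out)/d(other) = self.data
            out.backwards.extend([(self, other.data), (other, self.data)])
            return out

        def backward(self):
            for node, partial_derivative in self.backwards:
                node.grad += self.grad * partial_derivative
                node.backward()
                if not isinstance(node, Parameter):
                    node.zero_grad()


    class Parameter(Value):
        """Leaf node (trainable weight) whose gradient is kept after propagation."""
        pass


    # Usage: y = a * b + a, so dy/da = b + 1 and dy/db = a
    a, b = Parameter(2.0), Parameter(3.0)
    y = a * b + a
    y.grad = 1.0   # seed the output gradient
    y.backward()
    print(a.grad, b.grad)  # 4.0 2.0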

I tested it, and it appears to be working well. On the Iris dataset, I achieved an accuracy of 93%. I believe it could reach 100% if I use a more effective loss function.
My code: https://github.com/ickma/picograd

@dkgitcode

In this implementation, you are calculating the gradients eagerly during the forward pass rather than deferring them the way Karpathy does. While that may enhance clarity, wouldn't it add unnecessary computation during inference? You are crossing bridges before you get to them, and sometimes you never need to cross the bridge at all (i.e., a forward-only pass).
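
For comparison, here is a rough sketch of the deferred style (paraphrasing micrograd's approach, trimmed to the add case): the forward pass only records a closure over the parent nodes, and no gradient arithmetic runs unless backward() is actually called.

    class Value:
        def __init__(self, data, _children=()):
            self.data = data
            self.grad = 0.0
            self._backward = lambda: None   # closure filled in by the op that created this node
            self._prev = set(_children)

        def __add__(self, other):
            out = Value(self.data + other.data, (self, other))
            def _backward():
                # executed only if/when backward() is called on some ancestor
                self.grad += out.grad
                other.grad += out.grad
            out._backward = _backward
            return out

        def backward(self):
            # topologically order the graph, then run the stored closures once,
            # from the output back toward the leaves
            topo, visited = [], set()
            def build(v):
                if v not in visited:
                    visited.add(v)
                    for child in v._prev:
                        build(child)
                    topo.append(v)
            build(self)
            self.grad = 1.0
            for v in reversed(topo):
                v._backward()

Inference still records the graph structure, but none of the gradient arithmetic executes unless backward() is called.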

@ickma2311
Author

@dkgitcode Yes, you are correct. My implementation is less efficient.
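
One possible mitigation (a hypothetical tweak, not something currently in picograd): gate the bookkeeping behind a flag so a forward-only pass skips both the list building and the extra arithmetic. Modifying the __add__ from the sketch earlier in this thread:

    GRAD_ENABLED = True   # hypothetical switch; set to False for inference

    def __add__(self, other):
        out = Value(self.data + other.data)
        if GRAD_ENABLED:
            # record partial derivatives only when gradients will be needed
            out.backwards.extend([(self, 1), (other, 1)])
        return out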
