concept autograd in category deep learning

appears as: autograd
Deep Learning with PyTorch

This is an excerpt from Manning's book Deep Learning with PyTorch.

  • JAX, a library by Google that was developed independently from TensorFlow, has started gaining traction as a NumPy equivalent with GPU, autograd and JIT capabilities.
  • This is when PyTorch tensors come to the rescue, with a PyTorch component called autograd. Chapter 3 presented a comprehensive overview of what tensors are and what functions we can call on them. We left out one very interesting aspect, however: PyTorch tensors can remember where they come from, in terms of the operations and parent tensors that originated them, and they can automatically provide the chain of derivatives of such operations with respect to their inputs. This means we won’t need to derive our model by hand; given a forward expression, no matter how nested, PyTorch will automatically provide the gradient of that expression with respect to its input parameters.
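
    As a minimal illustration (not one of the book's listings), the sketch below builds a small forward expression from tensors created with requires_grad=True, calls backward(), and reads the accumulated gradients from the .grad attribute:

    import torch

    # Leaf tensors with requires_grad=True are tracked by autograd.
    w = torch.ones(1, requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    x = torch.tensor([1.0, 2.0, 3.0])

    loss = (w * x + b).mean()   # forward expression; PyTorch records the graph

    loss.backward()             # walk the graph backward, accumulate gradients

    print(w.grad)               # d(loss)/dw, here tensor([2.])
    print(b.grad)               # d(loss)/db, here tensor([1.])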

    Figure 5.10 The forward graph and backward graph of the model as computed with autograd

    To avoid building the autograd graph when we don’t need it (for example, when we are only evaluating the model on the validation set), PyTorch allows us to switch off autograd using the torch.no_grad context manager. We won’t see any meaningful advantage in terms of speed or memory consumption on our small problem. However, for larger models, the differences can add up. We can make sure this works by checking the value of the requires_grad attribute on the val_loss tensor:

    # In[16]:
    def training_loop(n_epochs, optimizer, params, train_t_u, val_t_u,
                      train_t_c, val_t_c):
        for epoch in range(1, n_epochs + 1):
            train_t_p = model(train_t_u, *params)
            train_loss = loss_fn(train_t_p, train_t_c)
     
            with torch.no_grad():                         # no autograd graph is built inside this block
                val_t_p = model(val_t_u, *params)
                val_loss = loss_fn(val_t_p, val_t_c)
                assert val_loss.requires_grad == False    # confirms autograd is switched off here
     
            optimizer.zero_grad()
            train_loss.backward()
            optimizer.step()
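
    For reference, a sketch of how this loop might be driven, assuming the chapter's linear model w * t_u + b, a mean-squared-error loss, and pre-split training and validation tensors defined earlier in the chapter (the splitting code is not shown in this excerpt):

    import torch
    import torch.optim as optim

    def model(t_u, w, b):                  # linear model
        return w * t_u + b

    def loss_fn(t_p, t_c):                 # mean squared error
        return ((t_p - t_c)**2).mean()

    params = torch.tensor([1.0, 0.0], requires_grad=True)
    optimizer = optim.SGD([params], lr=1e-2)

    training_loop(
        n_epochs=3000,
        optimizer=optimizer,
        params=params,
        train_t_u=train_t_un,              # normalized training inputs (assumed defined)
        val_t_u=val_t_un,                  # normalized validation inputs (assumed defined)
        train_t_c=train_t_c,               # training targets (assumed defined)
        val_t_c=val_t_c)                   # validation targets (assumed defined)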

    Using the related set_grad_enabled context, we can also condition the code to run with autograd enabled or disabled, according to a Boolean expression that typically indicates whether we are running in training or inference mode. We could, for instance, define a calc_forward function that takes data as input and runs model and loss_fn with or without autograd according to a Boolean is_train argument:
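
    A minimal sketch of such a function (the exact listing is not included in this excerpt), reusing the model, loss_fn, and params names from the training loop above:

    def calc_forward(t_u, t_c, is_train):
        # Build the autograd graph only when training; skip it for inference.
        with torch.set_grad_enabled(is_train):
            t_p = model(t_u, *params)
            loss = loss_fn(t_p, t_c)
        return loss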

    Grokking Deep Learning

    This is an excerpt from Manning's book Grokking Deep Learning.

    Perhaps the most elegant part of this form of autograd is that it works recursively as well, because each vector calls .backward() on all of its self.creators:
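
    A minimal sketch in the spirit of that chapter (supporting only addition, and leaving out the book's later refinements such as gradient accumulation) shows the recursion: calling .backward() on a tensor passes the gradient on to every tensor in self.creators, each of which does the same in turn:

    import numpy as np

    class Tensor:
        def __init__(self, data, creators=None, creation_op=None):
            self.data = np.array(data)
            self.creators = creators        # parent tensors that produced this one
            self.creation_op = creation_op  # the operation that produced it
            self.grad = None

        def __add__(self, other):
            return Tensor(self.data + other.data,
                          creators=[self, other],
                          creation_op="add")

        def backward(self, grad):
            self.grad = grad
            if self.creation_op == "add":
                # Addition passes the gradient through unchanged,
                # recursively, to each creator.
                for creator in self.creators:
                    creator.backward(grad)

    a = Tensor([1, 2, 3])
    b = Tensor([4, 5, 6])
    c = Tensor([7, 8, 9])
    e = (a + b) + c
    e.backward(np.array([1, 1, 1]))
    print(a.grad)    # [1 1 1], propagated recursively through the intermediate sum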
