What's the proper way to update a leaf tensor's values (e.g. during the update step of gradient descent)?

Problem description

Toy example

Consider a very simple gradient descent implementation that attempts to fit a linear regression (mx + b) to some toy data.

import torch

# Make some data
torch.manual_seed(0)
X = torch.rand(35) * 5
Y = 3 * X + torch.rand(35)

# Initialize m and b
m = torch.rand(size=(1,), requires_grad=True)
b = torch.rand(size=(1,), requires_grad=True)

# Pass 1
yhat = X * m + b    #  Calculate yhat
loss = torch.sqrt(torch.mean((yhat - Y)**2)) # Calculate the loss
loss.backward()     # Reverse mode differentiation
m = m - 0.1*m.grad  # update m
b = b - 0.1*b.grad  # update b
m.grad = None       # zero out m gradient
b.grad = None       # zero out b gradient

# Pass 2
yhat = X * m + b    #  Calculate yhat
loss = torch.sqrt(torch.mean((yhat - Y)**2)) # Calculate the loss
loss.backward()     # Reverse mode differentiation
m = m - 0.1*m.grad  # ERROR

Pass 1 works fine, but Pass 2 errors on its last line, m = m - 0.1*m.grad.

Error

/usr/local/lib/python3.7/dist-packages/torch/_tensor.py:1013: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at  aten/src/ATen/core/TensorBody.h:417.)
  return self._grad

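My understanding of why this happens is that, during Pass 1, this line

m = m - 0.1*m.grad

copies m into a brand new tensor (i.e. a totally separate block of memory). So m goes from being a leaf tensor to a non-leaf tensor.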

# Pass 1
...
print(f"{m.is_leaf}")  # True
m = m - 0.1*m.grad
print(f"{m.is_leaf}")  # False
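
The same check can be run on its own. The following is a minimal sketch, not code from the original post: the scalar `loss` and the names `m_new` and `m_new_grad` are introduced here purely for illustration. It shows that the out-of-place subtraction yields a tensor with a `grad_fn` (so it is not a leaf), and that its `.grad` stays unpopulated.

```python
import warnings
import torch

torch.manual_seed(0)
m = torch.rand(size=(1,), requires_grad=True)

# Any scalar that depends on m will do for this illustration.
loss = (m * 2.0).sum()
loss.backward()

# A tensor created directly by the user is a leaf and has no grad_fn.
print(m.is_leaf, m.grad_fn)

# Out-of-place update: autograd records the subtraction, so the result
# is a new, non-leaf tensor (kept under a separate name here).
m_new = m - 0.1 * m.grad
print(m_new.is_leaf, m_new.grad_fn is not None)

# .grad is not populated for the non-leaf result; PyTorch warns on access,
# so the warning is silenced for this demonstration only.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    m_new_grad = m_new.grad
print(m_new_grad)
```

Because Pass 1 rebinds the name m to such a non-leaf tensor, m.grad is not populated by the backward call in Pass 2.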

How do I perform the update?

I've seen it mentioned that something along the lines of m can be used.

Reference solution
Method 1:
Your observation is correct. To perform the update, you should:

1. Apply the modification with in-place operators.
2. Wrap the calls with the torch.no_grad context manager.
 
For instance:
with torch.no_grad():
    m -= 0.1*m.grad  # update m
    b -= 0.1*b.grad  # update b
(by Ben, Ivan)
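
Putting the two points together, a full training loop might look like the sketch below. It reuses the data and the 0.1 step size from the question. The names `lr`, `n_passes`, and `losses` are introduced here for illustration, and the number of passes is an arbitrary assumption. The in-place `-=` keeps `m` and `b` as the same leaf tensors, while `torch.no_grad()` keeps the update itself out of the autograd graph.

```python
import torch

# Same toy data as in the question
torch.manual_seed(0)
X = torch.rand(35) * 5
Y = 3 * X + torch.rand(35)

# Leaf tensors to be optimized
m = torch.rand(size=(1,), requires_grad=True)
b = torch.rand(size=(1,), requires_grad=True)

lr = 0.1        # step size used in the question
n_passes = 50   # assumed value, chosen only for illustration
losses = []

for _ in range(n_passes):
    yhat = X * m + b
    loss = torch.sqrt(torch.mean((yhat - Y) ** 2))
    loss.backward()

    # In-place update, not recorded by autograd: m and b stay leaf tensors
    with torch.no_grad():
        m -= lr * m.grad
        b -= lr * b.grad

    # Reset gradients so they do not accumulate across passes
    m.grad = None
    b.grad = None
    losses.append(loss.item())

print(m.is_leaf, b.is_leaf)
print(losses[0], min(losses))
```

With this pattern the second pass no longer trips over a missing .grad, because m and b are never rebound to new tensors.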
 
Reference
  What's the proper way to update a leaf tensor's values (e.g. during the update step of gradient descent) (CC BY-SA 2.5/3.0/4.0)
