How do you use a trained neural net to identify multiple objects in an image?

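Problem description

I have been exploring neural nets and have been able to train a network successfully on my own images by assigning each individual picture a specific item. What I cannot figure out is how to use the trained network to identify, and possibly return, multiple objects in one image. For example, if I trained on cats and dogs and had multiple cats and dogs in a single image, how would I apply the trained network and return their locations in the image?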

Here is the main tutorial I followed (a Python implementation): http://machinelearningmastery.com/object-recognition-convolutional-neural-networks-keras-deep-learning-library/

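A general answer would be sufficient. For example, is a sliding window over the image the best solution for this, or is there something easier?

A specific example (especially in Python) would be appreciated. I use matplotlib for most of my image work, so I would like to avoid PIL slicing.

Thanks!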

Reference solution

Method 1:

Since you want to use your existing trained network:

Brute-force sliding window: if you don't know the size and location of the objects in the image, you will have to process many windows, sliding by a number of pixels chosen according to the image size. Each window may produce a different outcome, and only one or a few of those outcomes are the results you actually want, so the complexity grows quickly. It becomes difficult to pick out the real detections among the many candidates.
Preprocessing: images can be preprocessed before being fed to the network. For instance, take an image containing a monkey and a snake and calculate the energy of the image (using Sobel or a similar operator). The monkey's footprint in the energy image is more like a round balloon (more area), while the snake's is thread-like (less area). Based on this, a Python script can crop the image to that particular section and feed the crop to the network. You can think of other preprocessing techniques as well.
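
As a rough illustration of the brute-force sliding window described above, here is a minimal sketch using only NumPy slicing (which works directly on the arrays matplotlib loads, so no PIL is needed). The names sliding_windows, non_max_suppression, detect and toy_classifier are hypothetical, and toy_classifier is only a stand-in for a wrapper around your own trained network's prediction call. The overlap filtering at the end is one common way to reduce the many overlapping hits to a few results; it is an addition for illustration, not something the answer prescribes.

```python
import numpy as np

def sliding_windows(image, window, stride):
    """Yield (x, y, patch) for each window position, using plain NumPy slicing."""
    height, width = image.shape[:2]
    for y in range(0, height - window + 1, stride):
        for x in range(0, width - window + 1, stride):
            yield x, y, image[y:y + window, x:x + window]

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def non_max_suppression(detections, iou_threshold=0.3):
    """Keep the best-scoring box among overlapping boxes with the same label."""
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(k[2] != det[2] or iou(det[0], k[0]) < iou_threshold for k in kept):
            kept.append(det)
    return kept

def detect(image, classify_patch, window=32, stride=16, min_score=0.9):
    """Classify every window; return (box, score, label) for confident hits."""
    detections = []
    for x, y, patch in sliding_windows(image, window, stride):
        label, score = classify_patch(patch)
        if score >= min_score:
            detections.append(((x, y, x + window, y + window), score, label))
    return non_max_suppression(detections)

def toy_classifier(patch):
    """Hypothetical stand-in for a trained network: scores a patch by brightness."""
    return "bright", float(patch.mean())
```

To try it, build a test image such as np.zeros((64, 64)) with a bright square in it and call detect(image, toy_classifier, min_score=0.4). Because the object size is usually unknown, in practice you would probably repeat the scan with several window sizes (resizing each patch to the network's input size) and merge the results.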

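The preprocessing idea above could be sketched roughly as follows, assuming the "energy" is taken to be the Sobel gradient magnitude and that a single region of interest is enough. The function names sobel_energy and crop_to_energy, the fraction threshold and the margin parameter are all hypothetical choices made for illustration. Separating several objects by the shape of their footprint (round versus thread-like) would need an extra step such as connected-component labeling, which is not shown here.

```python
import numpy as np

def sobel_energy(gray):
    """Gradient magnitude from 3x3 Sobel kernels, with edge padding."""
    p = np.pad(gray.astype(float), 1, mode="edge")
    # Horizontal gradient: right column minus left column, weighted 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Vertical gradient: bottom row minus top row, weighted 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.hypot(gx, gy)

def crop_to_energy(image, fraction=0.2, margin=2):
    """Crop to the bounding box of pixels whose energy exceeds fraction * max.

    Returns (crop, (x1, y1, x2, y2)); the whole image is returned if it is flat.
    """
    gray = image if image.ndim == 2 else image.mean(axis=2)
    energy = sobel_energy(gray)
    height, width = gray.shape
    if energy.max() == 0:
        return image, (0, 0, width, height)
    ys, xs = np.nonzero(energy > fraction * energy.max())
    x1 = max(int(xs.min()) - margin, 0)
    y1 = max(int(ys.min()) - margin, 0)
    x2 = min(int(xs.max()) + 1 + margin, width)
    y2 = min(int(ys.max()) + 1 + margin, height)
    return image[y1:y2, x1:x2], (x1, y1, x2, y2)
```

The returned crop is an ordinary NumPy array, so it can be displayed with matplotlib or resized and passed to the trained network in the same way as the original training images.
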
If you are open to other networks, check out CRF as Recurrent Neural Networks, for example: https://github.com/torrvision/crfasrnn

Hope this helps.

(by Beutler, Enigma)

Reference

How do you use a trained neural net to identify multiple objects in an image? (CC BY‑SA 2.5/3.0/4.0)
