Too low (frame) FPS compared to documentation #583
Comments
Hi there
Model Type: Roboflow 3.0 Instance Segmentation (Accurate). Is that what you mean?
no, I mean what is the name and version of the model you use
You mean from my login - it's not public?
ok, I will take a look at the metadata and try to reproduce the problem on a similar model to profile the server
Ty,
ok, that would be even better
will check and send a link
that should be the model with similar characteristics: I just checked the number from our benchmarks and last time we checked it was faster than you report, so I will redo the test once you confirm this 5 FPS on a public model and we will see
It is still 5 FPS, i.e. 200 ms/image, on the public model. I added some info. So somehow I am stuck with 5 FPS on an Orin Nano. I am glad for any ideas. Looking forward to next week. Cheers,
Using "yolov8s-seg-640"
➜ ~ docker run --net=host --runtime=nvidia --env INSTANCES=2 -d roboflow/roboflow-inference-server-jetson-5.1.1
2c64571536487f15d998db82ed931cc3daed943db4c8958e3e09cc9e4503f101
➜ ~ docker logs -f upbeat_hoover
UserWarning: Unable to import Axes3D. This may be due to multiple versions of Matplotlib being installed (e.g. as a system package and as a pip package). As a result, the 3D projection is not available.
SupervisionWarnings: BoundingBoxAnnotator is deprecated: `BoundingBoxAnnotator` is deprecated and has been renamed to `BoxAnnotator`. `BoundingBoxAnnotator` will be removed in supervision-0.26.0.
UserWarning: Field name "schema" in "WorkflowsBlocksSchemaDescription" shadows an attribute in parent "BaseModel"
INFO: Started server process [19]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:9001 (Press CTRL+C to quit)
INFO: 192.168.2.12:45126 - "GET /model/registry HTTP/1.1" 200 OK
UserWarning: Specified provider 'OpenVINOExecutionProvider' is not in available provider names.Available providers: 'TensorrtExecutionProvider, CUDAExecutionProvider, CPUExecutionProvider'
INFO: 192.168.2.12:45132 - "POST /model/add HTTP/1.1" 200 OK
INFO: 192.168.2.12:43590 - "POST /infer/instance_segmentation HTTP/1.1" 200 OK
INFO: 192.168.2.12:43598 - "GET /model/registry HTTP/1.1" 200 OK
INFO: 192.168.2.12:43614 - "POST /infer/instance_segmentation HTTP/1.1" 200 OK
# goes on forever....
Using "Ran0mMod3lID"
I checked that the model ID is not arbitrary, so the server really uses the model you provided:
# the docker container rejects a random model ID as invalid
inference.core.exceptions.InvalidModelIDError: Model ID: `Ran0mMod3lID` is invalid.
INFO: 192.168.2.12:47106 - "POST /model/add HTTP/1.1" 400 Bad Request
ok, will verify on my end and reach you back
The benchmarks you're citing are for a nano-sized object detection model vs a small-sized instance segmentation model. Should be
@yeldarby your proposed Model
@PawelPeczek-Roboflow I checked what a reduced resolution changes.
client = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key=ROBOFLOW_API_KEY,
)
# 100 times fewer pixels (each side scaled to 10%)
img = cv2.resize(img, (0, 0), fx=0.1, fy=0.1)
results = client.infer(img, model_id=ROBOFLOW_MODEL)
It doubled the frame rate to ~11 fps. I don't know how sensible that is - just fyi.
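For reference, the arithmetic behind the numbers above can be sketched as follows. This is only an illustration: the helper names `fps_to_latency_ms` and `pixel_ratio` are made up for this comment, and the FPS values are the ones reported in this thread, not new measurements.

```python
def fps_to_latency_ms(fps: float) -> float:
    """Convert a frame rate into the average time spent per frame, in ms."""
    return 1000.0 / fps


def pixel_ratio(fx: float, fy: float) -> float:
    """How many times fewer pixels an image has after scaling by fx and fy."""
    return 1.0 / (fx * fy)


# Values reported in this thread (not re-measured here).
before_ms = fps_to_latency_ms(5.0)   # full-resolution frames
after_ms = fps_to_latency_ms(11.0)   # frames resized with fx = fy = 0.1
speedup = before_ms / after_ms

print(f"pixel reduction: ~{pixel_ratio(0.1, 0.1):.0f}x")
print(f"latency: {before_ms:.0f} ms -> {after_ms:.0f} ms ({speedup:.1f}x faster)")
```

So roughly 100 times fewer pixels bought only about a 2.2x speedup. One possible reading (an assumption, not something verified here) is that image size only affects part of the per-request cost, such as encoding, transfer and preprocessing, while the rest of the time per request does not depend on it.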
just checking on my jetson now - my first guess was that the camera may be providing high-res frames, but let's see what my test shows
Ok, seems that @clausMeko is right with his results, those are benchmarks for segmentation models:
Docs are probably referring to object detection models, which look like this:
@PawelPeczek-Roboflow so you would recommend choosing object detection over segmentation models if it is about performance?
That really depends on your use case - some tasks can be performed by both types of models, some cannot.
@PawelPeczek-Roboflow I would like to use concurrency for inference. I.e. if a request takes ~100ms then I could do 3 requests every 33ms etc. Do you have a python code-snippet to do that? I saw your envVar
Sorry for the late response, I believe we do not have a script to distribute requests. Cannot really find this
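In case it helps, a minimal sketch of distributing requests from the client side with a plain thread pool could look like this. It is not an official Roboflow script. `infer_concurrently` and `fake_infer` are hypothetical names; `fake_infer` only sleeps to stand in for a ~100 ms request so the snippet runs without a server. Whether the server actually handles requests in parallel, and whether a single client object can be shared between threads, are assumptions to verify.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable, List


def infer_concurrently(
    frames: Iterable[Any],
    infer_fn: Callable[[Any], Any],
    max_workers: int = 3,
) -> List[Any]:
    """Call infer_fn on every frame from a thread pool.

    Results are returned in the same order as the input frames.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(infer_fn, frames))


def fake_infer(frame: int) -> int:
    """Stand-in for an HTTP inference call (hypothetical, no server needed)."""
    time.sleep(0.1)  # pretend one request takes ~100 ms
    return frame * 2


start = time.perf_counter()
results = infer_concurrently(range(6), fake_infer, max_workers=3)
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s for 6 fake requests with 3 workers")
```

With a real server, `infer_fn` could be something like `lambda img: client.infer(img, model_id=ROBOFLOW_MODEL)`, reusing the client from the earlier comment. Keep in mind this only raises throughput; the latency of each individual frame stays the same.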
Hi @clausMeko, I looked at your example code and thought maybe you would like to try:
from functools import partial
from typing import Any, Dict, Optional, Tuple, Union

import cv2 as cv
import numpy as np
import supervision as sv
from pypylon import pylon

from inference.core.interfaces.camera.entities import (
    SourceProperties,
    VideoFrame,
    VideoFrameProducer,
)
from inference.core.interfaces.stream.inference_pipeline import InferencePipeline


class BaslerFrameProducer(VideoFrameProducer):
    def __init__(self):
        # Use the first Basler camera found and keep only the most recent frame.
        self._camera: pylon.InstantCamera = pylon.InstantCamera(
            pylon.TlFactory.GetInstance().CreateFirstDevice()
        )
        self._camera.StartGrabbing(pylon.GrabStrategy_LatestImageOnly)
        self._converter = pylon.ImageFormatConverter()

    def grab(self) -> bool:
        # Fetch a frame and discard it; only report whether grabbing worked.
        grabResult = self._camera.RetrieveResult(5000, pylon.TimeoutHandling_ThrowException)
        grab_succeeded = grabResult.GrabSucceeded()
        grabResult.Release()
        return grab_succeeded

    def retrieve(self) -> Tuple[bool, Optional[np.ndarray]]:
        grabResult = self._camera.RetrieveResult(5000, pylon.TimeoutHandling_ThrowException)
        if not grabResult.GrabSucceeded():
            grabResult.Release()
            return False, None
        image = self._converter.Convert(grabResult)
        img = image.GetArray()
        grabResult.Release()
        return True, img

    def release(self):
        # Stop the camera so that isOpened() reports False afterwards
        # (the first draft only set an unused flag here).
        self._camera.StopGrabbing()
        self._camera.Close()

    def isOpened(self) -> bool:
        return self._camera.IsGrabbing()

    def discover_source_properties(self) -> SourceProperties:
        # Grab one frame to learn the image size.
        grabResult = self._camera.RetrieveResult(5000, pylon.TimeoutHandling_ThrowException)
        if not grabResult.GrabSucceeded():
            grabResult.Release()
            # Returning (False, None) here would not match the declared return type.
            raise RuntimeError("Could not grab a frame to discover source properties")
        image = self._converter.Convert(grabResult)
        img = image.GetArray()
        grabResult.Release()
        h, w, *_ = img.shape
        return SourceProperties(
            width=w,
            height=h,
            total_frames=-1,
            is_file=False,
            fps=1,  # TODO: can't see FPS in pylon.InstantCamera, we measure FPS in inference pipeline and expose as Frame property
            is_reconnectable=False,
        )

    def initialize_source_properties(self, properties: Dict[str, float]):
        pass


basler_producer = partial(
    BaslerFrameProducer,
)

# BoundingBoxAnnotator is deprecated and renamed to BoxAnnotator (see the warning in the server log above).
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()


def custom_sink(prediction: Dict[str, Any], video_frame: VideoFrame) -> None:
    # Draw boxes and class labels on a copy of the frame and display it.
    detections = sv.Detections.from_inference(prediction)
    labels = [f"#{class_name}" for class_name in detections["class_name"]]
    annotated_frame = box_annotator.annotate(
        video_frame.image.copy(), detections=detections
    )
    annotated_frame = label_annotator.annotate(
        annotated_frame, detections=detections, labels=labels
    )
    cv.imshow("", annotated_frame)
    cv.waitKey(1)


inference_pipeline = InferencePipeline.init(
    video_reference=basler_producer,
    model_id="yolov8n-640",
    on_prediction=custom_sink,
)
inference_pipeline.start()
inference_pipeline.join()
Bug
Set Up
I use a Basler Camera acA1920-40uc. It provides ~50 fps as cv2.Image via opencv. I use your sdk to post those images.
On the same device: jetson orin nano (no network latency) docker runs the inference-server-jets-5.1.1 image.
For testing I ran the same setup on my notebook (dell precision 5570 - i7-12700H) with the cpu image.
Problem
The inference takes longer than expected: ~200 ms per image (self-computed, ~5 fps). This is disappointing for 2 reasons:
Question
Is there anything I am not considering so I can improve my performance?
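One way to narrow this down is to time each stage on its own (frame grab, resize, the inference request) rather than only the whole loop. A minimal sketch, assuming nothing about the SDK: `measure` is a hypothetical helper that times any zero-argument callable, and `fake_stage` is a stand-in that just sleeps for 10 ms.

```python
import time
from statistics import mean
from typing import Callable, Dict


def measure(fn: Callable[[], object], repeats: int = 20) -> Dict[str, float]:
    """Time `fn` over several calls; return mean latency in ms and the implied FPS."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    mean_s = mean(durations)
    return {"mean_ms": mean_s * 1000.0, "fps": 1.0 / mean_s}


def fake_stage() -> None:
    """Stand-in for one pipeline stage (hypothetical)."""
    time.sleep(0.01)


stats = measure(fake_stage, repeats=5)
print(f"{stats['mean_ms']:.1f} ms per call, ~{stats['fps']:.0f} calls/s")
```

In the real setup the callable could wrap a single stage, for example `lambda: client.infer(img, model_id=ROBOFLOW_MODEL)`, so the ~200 ms can be split into camera, preprocessing and server time.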
Environment
roboflow/roboflow-inference-server-jetson-5.1.1:latest
Minimal Reproducible Example
Sorry - I merged 2 files if something seems odd.
Additional
No response