Context: I am performing object localization and want to implement an inhibition-of-return mechanism (i.e. drawing a black cross over the region of the image covered by the red bounding box once the trigger action is taken).
Problem: I do not know how to accurately rescale the (red) bounding box relative to the original input (init_input). Once that rescaling is understood, the black cross can be placed exactly at the center of the red bounding box.
My current code for this function is as follows:
def IoR(b, init_input, prev_coord):
    """
    Inhibition-of-Return mechanism.

    Marks the region of the image covered by
    the bounding box with a black cross.

    :param b:
        The current bounding box represented as [x1, y1, x2, y2].
    :param init_input:
        The initial input volume of the current episode.
    :param prev_coord:
        The previous state's bounding box coordinates (x1, y1, x2, y2).
    """
    x1, y1, x2, y2 = prev_coord
    width = 12
    x_mid = (b[2] + b[0]) // 2
    y_mid = (b[3] + b[1]) // 2

    # Define vertical rectangle coordinates
    ver_x1 = int((x_mid * IMG_SIZE / (x2 - x1)) - width)
    ver_x2 = int((x_mid * IMG_SIZE / (x2 - x1)) + width)
    ver_y1 = int(b[1] * IMG_SIZE / (y2 - y1))
    ver_y2 = int(b[3] * IMG_SIZE / (y2 - y1))

    # Define horizontal rectangle coordinates
    hor_x1 = int(b[0] * IMG_SIZE / (x2 - x1))
    hor_x2 = int(b[2] * IMG_SIZE / (x2 - x1))
    hor_y1 = int((y_mid * IMG_SIZE / (y2 - y1)) - width)
    hor_y2 = int((y_mid * IMG_SIZE / (y2 - y1)) + width)

    # Draw vertical rectangle
    cv2.rectangle(init_input, (ver_x1, ver_y1), (ver_x2, ver_y2), (0, 0, 0), -1)
    # Draw horizontal rectangle
    cv2.rectangle(init_input, (hor_x1, hor_y1), (hor_x2, hor_y2), (0, 0, 0), -1)
The desired effect looks like this:
Note: I believe the complexity of this problem comes from the fact that the image is resized (to 224, 224, 3) every time I take an action (and move on to the next state). The "anchor" used to determine the rescaling therefore has to be extracted from the previous state's scaling, as shown in the following code:
def next_state(init_input, b_prime, g):
    """
    Returns the observable region of the next state.

    Formats the next state's observable region, defined
    by b_prime, to be of dimension (224, 224, 3), adding 16
    additional pixels of context around the original bounding box.
    The ground truth box must be reformatted according to the
    new observable region.

    IMG_SIZE = 224

    :param init_input:
        The initial input volume of the current episode.
    :param b_prime:
        The subsequent state's bounding box.
    :param g: (init_g)
        The initial ground truth box of the target object.
    """
    # Determine the pixel coordinates of the observable region for the following state
    context_pixels = 16
    x1 = max(b_prime[0] - context_pixels, 0)
    y1 = max(b_prime[1] - context_pixels, 0)
    x2 = min(b_prime[2] + context_pixels, IMG_SIZE)
    y2 = min(b_prime[3] + context_pixels, IMG_SIZE)

    # Determine observable region
    observable_region = cv2.resize(init_input[y1:y2, x1:x2], (224, 224), interpolation=cv2.INTER_AREA)

    # Resize ground truth box
    g[0] = int((g[0] - x1) * IMG_SIZE / (x2 - x1))  # x1
    g[1] = int((g[1] - y1) * IMG_SIZE / (y2 - y1))  # y1
    g[2] = int((g[2] - x1) * IMG_SIZE / (x2 - x1))  # x2
    g[3] = int((g[3] - y1) * IMG_SIZE / (y2 - y1))  # y2

    return observable_region, g, (b_prime[0], b_prime[1], b_prime[2], b_prime[3])
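To make the data flow concrete, here is a minimal usage sketch of next_state as defined above (a hypothetical call with a dummy image and illustrative box values, assuming numpy/cv2 and IMG_SIZE = 224):

import cv2
import numpy as np

IMG_SIZE = 224

# Dummy original image plus illustrative boxes (hypothetical values, not from the post)
init_input = np.zeros((IMG_SIZE, IMG_SIZE, 3), dtype=np.uint8)
b_prime = [60, 40, 160, 140]   # the agent's new bounding box in the current frame
init_g = [70, 50, 150, 130]    # ground truth box in the same frame

observable_region, g, prev_coord = next_state(init_input, b_prime, list(init_g))

print(observable_region.shape)  # (224, 224, 3) -- the upscaled crop
print(g)                        # ground truth remapped into the zoomed frame
print(prev_coord)               # (60, 40, 160, 140) -- the "anchor" later passed to IoR

Note that the crop itself uses the padded window (x1, y1, x2, y2), while the returned anchor prev_coord is the unpadded b_prime; that distinction matters when scaling back.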
Explanation:
There is a state t in which the agent is predicting the location of the target object. The target object has a ground-truth box (yellow in the image, dotted in the sketch), and the agent's current "localization box" is the red bounding box. Say that, at state t, the agent decides it is best to move right. The bounding box is therefore moved to the right, and the next state t' is then determined by adding an extra 16 pixels of context around the red bounding box, cropping the original image with respect to that boundary, and upscaling the crop back to 224 x 224.
Say the agent is now confident that its prediction is accurate, so it chooses the trigger action. This essentially means ending the current target object's localization episode and placing a black cross where the agent predicts the object is (i.e. at the center of the red bounding box). Now, since the current state is an upscaled crop produced after the previous action, the bounding box has to be rescaled back relative to the normal/original/initial image before the black cross can be drawn onto the image accurately.
In my case, the first rescaling between states works very well (the second code block in this post). However, scaling back down to normal and drawing the black cross is what I cannot seem to figure out.
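For what it is worth, the scale-back being asked about is just the inverse of the crop-and-resize done in next_state: a point (x, y) in the zoomed 224 x 224 frame maps back to the frame that was cropped via x / IMG_SIZE * (x2 - x1) + x1 (and likewise for y), where (x1, y1, x2, y2) is the window that was actually cropped. A minimal sketch, using a hypothetical helper and illustrative numbers:

def to_original_coords(point, crop_window, img_size=224):
    # Map a point from the zoomed (img_size x img_size) frame back into the frame
    # that was cropped; crop_window is the (x1, y1, x2, y2) region that was cut out
    # before resizing. (Hypothetical helper, not part of the original code.)
    x1, y1, x2, y2 = crop_window
    x, y = point
    return (int(x / img_size * (x2 - x1) + x1),
            int(y / img_size * (y2 - y1) + y1))

# The cross center in the zoomed frame is the middle of the red box b,
# and crop_window is the padded region that next_state actually cropped.
b = [60, 40, 160, 140]            # illustrative box in the zoomed frame
crop_window = (44, 24, 176, 156)  # illustrative padded crop in the previous frame
center = ((b[0] + b[2]) // 2, (b[1] + b[3]) // 2)
print(to_original_coords(center, crop_window))  # center expressed in the previous frame

If several zoom steps have been taken, this mapping has to be applied once per step in reverse order; alternatively the coordinates can be tracked globally from the start, which is what the answer below suggests.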
Here is an image that will hopefully help explain:
Below is the output of my current solution:
Best answer
I think it is better to keep the coordinates globally instead of using a bunch of upscaling/downscaling operations. They give me a headache, and the rounding involved can cause a loss of precision.
That is, every time you detect something, convert it to global (original-image) coordinates first. I have written a small demo here that mimics your detection and trigger behavior.
Initial detection:
Zoomed in, another detection:
Zoomed in, another detection:
Zoomed in, another detection:
Zoomed back out to the original scale, with the detection box in the right place:
Code:
import cv2
import matplotlib.pyplot as plt

IMG_SIZE = 224

im = cv2.cvtColor(cv2.imread('lena.jpg'), cv2.COLOR_BGR2GRAY)
im = cv2.resize(im, (IMG_SIZE, IMG_SIZE))

# Your detector results
detected_region = [
    [(10, 20),   (80, 100)],
    [(50, 0),    (220, 190)],
    [(100, 143), (180, 200)],
    [(110, 45),  (180, 150)]
]

# Global states
x_scale = 1.0
y_scale = 1.0
x_shift = 0
y_shift = 0

x1, y1 = 0, 0
x2, y2 = IMG_SIZE - 1, IMG_SIZE - 1
for region in detected_region:
    # Detection
    x_scale = IMG_SIZE / (x2 - x1)
    y_scale = IMG_SIZE / (y2 - y1)
    x_shift = x1
    y_shift = y1

    cur_im = cv2.resize(im[y1:y2, x1:x2], (IMG_SIZE, IMG_SIZE))

    # Assuming the detector returns these results
    cv2.rectangle(cur_im, region[0], region[1], (255))
    plt.imshow(cur_im)
    plt.show()

    # Zooming in, using part of your code
    context_pixels = 16
    x1 = max(region[0][0] - context_pixels, 0) / x_scale + x_shift
    y1 = max(region[0][1] - context_pixels, 0) / y_scale + y_shift
    x2 = min(region[1][0] + context_pixels, IMG_SIZE) / x_scale + x_shift
    y2 = min(region[1][1] + context_pixels, IMG_SIZE) / y_scale + y_shift
    x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)

# Assuming the detector confirms its choice here
print('Confirmed detection: ', x1, y1, x2, y2)

# This time no padding
x1 = detected_region[-1][0][0] / x_scale + x_shift
y1 = detected_region[-1][0][1] / y_scale + y_shift
x2 = detected_region[-1][1][0] / x_scale + x_shift
y2 = detected_region[-1][1][1] / y_scale + y_shift
x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)

cv2.rectangle(im, (x1, y1), (x2, y2), (255, 0, 0))
plt.imshow(im)
plt.show()
This also avoids resizing an already-resized image, which would introduce more artifacts and make the detector perform worse.
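Applied back to the original question, the same global coordinates can feed the inhibition-of-return step directly: once x1, y1, x2, y2 hold the confirmed box in original-image coordinates, the black cross can be drawn on the original image without any further rescaling. A sketch along the lines of the asker's IoR function (the cross arm half-width of 12 is taken from the question; the helper itself is hypothetical):

import cv2

def draw_ior_cross(img, box, width=12):
    # Draw a filled black cross centered on box = (x1, y1, x2, y2),
    # where box is already expressed in the coordinates of img (i.e. global).
    x1, y1, x2, y2 = box
    x_mid = (x1 + x2) // 2
    y_mid = (y1 + y2) // 2
    # Vertical bar of the cross, spanning the box height
    cv2.rectangle(img, (x_mid - width, y1), (x_mid + width, y2), (0, 0, 0), -1)
    # Horizontal bar of the cross, spanning the box width
    cv2.rectangle(img, (x1, y_mid - width), (x2, y_mid + width), (0, 0, 0), -1)
    return img

# e.g. after the demo above: draw_ior_cross(im, (x1, y1, x2, y2))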