2D 阵列中的峰值检测

我正在帮助一家兽医诊所测量狗爪下的压力。我使用 Python 进行数据分析，现在我不得不试图将爪子分成（解剖学）子区域。

我制作了每个爪子的 2D 阵列，它由爪子随时间加载的每个传感器的最大值组成。这是一个爪子的例子，我用 Excel 绘制了我想要 “检测” 的区域。这些是传感器周围的 2×2 个盒子，具有局部最大值，它们一起具有最大的总和。

替代文字

所以我尝试了一些实验并决定只查找每列和每行的最大值（由于爪子的形状，不能在一个方向上查看）。这似乎可以很好地 “检测” 单独脚趾的位置，但它也标记了相邻的传感器。

替代文字

那么告诉 Python 哪些最大值是我想要的最好的方法是什么？

注意：2x2 正方形不能重叠，因为它们必须是单独的脚趾！

我也采用 2x2 作为方便，欢迎任何更高级的解决方案，但我只是一个人类运动科学家，所以我既不是真正的程序员也不是数学家，所以请保持 “简单”。

这是一个可以用np.loadtxt加载的版本

结果

所以我尝试了 @jextee 的解决方案（见下面的结果）。正如你所看到的，它在前爪上很有效，但后腿的效果不太好。

更具体地说，它无法识别出第四个脚趾的小峰值。这显然是循环看起来自上而下朝向最低值的事实所固有的，而不考虑这是什么。

有谁知道如何调整 @jextee 的算法，以便它也可以找到第 4 个脚趾？

替代文字

由于我还没有处理任何其他试验，我不能提供任何其他样品。但我之前提供的数据是每只爪子的平均值。该文件是一个数组，其最大数据为 9 个爪子，它们与盘子接触的顺序。

该图像显示了它们如何在空间上展开。

替代文字

更新：

我已经为任何感兴趣的人建立了一个博客 ，我已经设置了一个包含所有原始测量值的 SkyDrive。所以对于要求更多数据的人来说：给你更大的力量！

新更新：

所以在我得到关于爪子检测和爪子分类的问题的帮助后，我终于能够检查每个爪子的脚趾检测！事实证明，除了像我自己的例子中那样大小的爪子之外，它在任何东西上都不能很好地工作。事后看来，任意选择 2x2 是我自己的错。

这是一个错误的例子：钉子被识别为脚趾，“脚跟” 如此宽，它被识别两次！

替代文字

爪子太大，因此在没有重叠的情况下采用 2x2 尺寸会导致一些脚趾被检测到两次。相反，在小型犬中，它经常无法找到第五个脚趾，我怀疑它是由 2x2 区域太大引起的。

在对我的所有测量结果进行了当前的解决方案之后，我得出了令人吃惊的结论：几乎所有的小型犬都没有找到第 5 个脚趾，并且对于大型犬的 50％以上的影响它会发现更多！

显然我需要改变它。我自己的猜测是，对于小型犬而言， neighborhood的大小会变小，而对于大型犬来说则更大。但是generate_binary_structure不会让我改变数组的大小。

因此，我希望其他人有更好的建议来定位脚趾，也许脚趾区域尺寸与爪子尺寸一致？

python image-processing

答案

我使用局部最大滤波器检测到峰值。以下是您的第一个 4 爪数据集的结果：峰值检测结果

我还在 9 个爪子的第二个数据集上运行它，它也运行良好。

这是你如何做到的：

import numpy as np
from scipy.ndimage.filters import maximum_filter
from scipy.ndimage.morphology import generate_binary_structure, binary_erosion
import matplotlib.pyplot as pp

#for some reason I had to reshape. Numpy ignored the shape header.
paws_data = np.loadtxt("paws.txt").reshape(4,11,14)

#getting a list of images
paws = [p.squeeze() for p in np.vsplit(paws_data,4)]


def detect_peaks(image):
    """
    Takes an image and detect the peaks usingthe local maximum filter.
    Returns a boolean mask of the peaks (i.e. 1 when
    the pixel's value is the neighborhood maximum, 0 otherwise)
    """

    # define an 8-connected neighborhood
    neighborhood = generate_binary_structure(2,2)

    #apply the local maximum filter; all pixel of maximal value 
    #in their neighborhood are set to 1
    local_max = maximum_filter(image, footprint=neighborhood)==image
    #local_max is a mask that contains the peaks we are 
    #looking for, but also the background.
    #In order to isolate the peaks we must remove the background from the mask.

    #we create the mask of the background
    background = (image==0)

    #a little technicality: we must erode the background in order to 
    #successfully subtract it form local_max, otherwise a line will 
    #appear along the background border (artifact of the local maximum filter)
    eroded_background = binary_erosion(background, structure=neighborhood, border_value=1)

    #we obtain the final mask, containing only peaks, 
    #by removing the background from the local_max mask (xor operation)
    detected_peaks = local_max ^ eroded_background

    return detected_peaks


#applying the detection and plotting results
for i, paw in enumerate(paws):
    detected_peaks = detect_peaks(paw)
    pp.subplot(4,2,(2*i+1))
    pp.imshow(paw)
    pp.subplot(4,2,(2*i+2) )
    pp.imshow(detected_peaks)

pp.show()

您需要做的就是在掩码上使用scipy.ndimage.measurements.label来标记所有不同的对象。然后你就可以单独玩它们了。

请注意 ，该方法效果很好，因为背景不嘈杂。如果是，你会在后台检测到一堆其他不需要的峰值。另一个重要因素是邻里的规模。如果峰值大小发生变化，则需要对其进行调整（应保持大致成比例）。

解

数据文件： paw.txt 。源代码：

from scipy import *
from operator import itemgetter

n = 5  # how many fingers are we looking for

d = loadtxt("paw.txt")
width, height = d.shape

# Create an array where every element is a sum of 2x2 squares.

fourSums = d[:-1,:-1] + d[1:,:-1] + d[1:,1:] + d[:-1,1:]

# Find positions of the fingers.

# Pair each sum with its position number (from 0 to width*height-1),

pairs = zip(arange(width*height), fourSums.flatten())

# Sort by descending sum value, filter overlapping squares

def drop_overlapping(pairs):
    no_overlaps = []
    def does_not_overlap(p1, p2):
        i1, i2 = p1[0], p2[0]
        r1, col1 = i1 / (width-1), i1 % (width-1)
        r2, col2 = i2 / (width-1), i2 % (width-1)
        return (max(abs(r1-r2),abs(col1-col2)) >= 2)
    for p in pairs:
        if all(map(lambda prev: does_not_overlap(p,prev), no_overlaps)):
            no_overlaps.append(p)
    return no_overlaps

pairs2 = drop_overlapping(sorted(pairs, key=itemgetter(1), reverse=True))

# Take the first n with the heighest values

positions = pairs2[:n]

# Print results

print d, "\n"

for i, val in positions:
    row = i / (width-1)
    column = i % (width-1)
    print "sum = %f @ %d,%d (%d)" % (val, row, column, i)
    print d[row:row+2,column:column+2], "\n"

输出没有重叠方块。似乎在您的示例中选择了相同的区域。

一些评论

棘手的部分是计算所有 2x2 平方的总和。我假设你需要所有这些，所以可能会有一些重叠。我使用切片从原始 2D 数组中剪切第一个 / 最后一个列和行，然后将它们重叠在一起并计算总和。

为了更好地理解它，对 3x3 阵列进行成像：

>>> a = arange(9).reshape(3,3) ; a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

然后你可以拿它的切片：

>>> a[:-1,:-1]
array([[0, 1],
       [3, 4]])
>>> a[1:,:-1]
array([[3, 4],
       [6, 7]])
>>> a[:-1,1:]
array([[1, 2],
       [4, 5]])
>>> a[1:,1:]
array([[4, 5],
       [7, 8]])

现在想象一下，将它们叠加在另一个之上，并在相同位置求和元素。这些总和与 2x2 正方形的总和相同，左上角位于同一位置：

>>> sums = a[:-1,:-1] + a[1:,:-1] + a[:-1,1:] + a[1:,1:]; sums
array([[ 8, 12],
       [20, 24]])

如果总和超过 2x2 平方，则可以使用max查找最大值，或sort ，或sorted以查找峰值。

为了记住峰值的位置，我将每个值（总和）与其在序列阵列中的序数位置耦合（参见zip ）。然后我打印结果时再次计算行 / 列位置。

笔记

我允许 2x2 方格重叠。编辑版本过滤掉其中一些版本，使得结果中仅出现非重叠的正方形。

选择手指（一个想法）

另一个问题是如何从所有峰中选择可能是手指的东西。我有一个想法可能会或可能不会奏效。我现在没时间实现它，所以只是伪代码。

我注意到如果前手指几乎保持完美的圆形，则后指应位于该圆圈的内侧。而且，前指或多或少地等间隔。我们可能会尝试使用这些启发式属性来检测手指。

伪代码：

select the top N finger candidates (not too many, 10 or 12)
consider all possible combinations of 5 out of N (use itertools.combinations)
for each combination of 5 fingers:
    for each finger out of 5:
        fit the best circle to the remaining 4
        => position of the center, radius
        check if the selected finger is inside of the circle
        check if the remaining four are evenly spread
        (for example, consider angles from the center of the circle)
        assign some cost (penalty) to this selection of 4 peaks + a rear finger
        (consider, probably weighted:
             circle fitting error,
             if the rear finger is inside,
             variance in the spreading of the front fingers,
             total intensity of 5 peaks)
choose a combination of 4 peaks + a rear peak with the lowest penalty

这是一种蛮力的方法。如果 N 相对较小，那么我认为它是可行的。对于 N = 12，有 C_12 ^ 5 = 792 种组合，有 5 种方式可以选择一个后指，所以 3960 个案例来评估每个爪子。

这是图像注册问题。总体战略是：

有一个已知的例子，或者某种先前的数据。
使您的数据适合示例，或使示例适合您的数据。
如果您的数据首先大致一致，这会有所帮助。

这是一个粗略而准备好的方法 ，“可能有用的最蠢的事情”：

从大约您期望的位置开始五个脚趾坐标。
每一个，迭代地爬到山顶。即，给定当前位置，如果其值大于当前像素，则移动到最大相邻像素。当你的脚趾坐标停止移动时停止。

要抵消方向问题，您可以为基本方向（北，东北等）设置 8 个左右的初始设置。单独运行每一个并丢弃任何结果，其中两个或多个脚趾最终在同一像素。我会更多地考虑这个问题，但是这种事情仍然在图像处理中进行研究 - 没有正确的答案！

稍微复杂的想法:(加权）K 均值聚类。没那么糟糕。

从五个脚趾坐标开始，但现在这些是 “集群中心”。

然后迭代直到收敛：

将每个像素分配给最近的群集（只为每个群集创建一个列表）。
计算每个群集的质心。对于每个聚类，它是：Sum（坐标 * 强度值）/ Sum（坐标）
将每个群集移动到新的质心。

这种方法几乎可以肯定地提供更好的结果，并且您可以获得每个群集的质量，这可能有助于识别脚趾。

（同样，您已经预先指定了群集的数量。使用群集，您必须以这样或那样的方式指定密度：在这种情况下选择适合的群集数量，或者选择群集半径并查看结束的群集数量后者的一个例子是平均移位。）

对于缺乏实施细节或其他细节感到抱歉。我会对此进行编码，但我有一个截止日期。如果下周没有其他任何工作让我知道，我会试一试。