注册 登录  
 加关注
   显示下一条  |  关闭
温馨提示!由于新浪微博认证机制调整,您的新浪微博帐号绑定已过期,请重新绑定!立即重新绑定新浪微博》  |  关闭

Tsecer的回音岛

Tsecer的博客

 
 
 

日志

 
 

stl::multimap的equal-range实现原理  

2017-06-01 21:11:32|  分类: C/C++基础 |  标签: |举报 |字号 订阅

  下载LOFTER 我的照片书  |
一、lower_bound、upper_bound及equal_range
C++标准中对于 lower_bound、upper_bound、equal_range的说明
25.3.3.1 lower_bound
template<class ForwardIterator, class T>
        ForwardIterator
        lower_bound(ForwardIterator first, ForwardIterator last,
                        const T& value);
template<class ForwardIterator, class T, class Compare>
        ForwardIterator
        lower_bound(ForwardIterator first, ForwardIterator last,
                const T& value, Compare comp);
Effects: Finds the first position into which value can be inserted without violating the ordering.
Returns: The furthermost iterator i in the range [ first, last) such that for any iterator j in the
        range [ first, i) the following corresponding conditions hold: *j < value or comp(*j,
        value) == true
Complexity: At most log( last - first) + 1 comparisons.

25.3.3.2 upper_bound
template<class ForwardIterator, class T>
ForwardIterator
        upper_bound(ForwardIterator first, ForwardIterator last,
                const T& value);
template<class ForwardIterator, class T, class Compare>
ForwardIterator
        upper_bound(ForwardIterator first, ForwardIterator last,
                const T& value, Compare comp);
Effects: Finds the furthermost position into which value can be inserted without violating the ordering.
Returns: The furthermost iterator i in the range [ first, last) such that for any iterator j in the
        range [ first, i) the following corresponding conditions hold: !(value < *j) or
        comp( value, *j) == false
Complexity: At most log( last - first) + 1 comparisons.
25.3.3.3 equal_range
template<class ForwardIterator, class T>
        pair<ForwardIterator, ForwardIterator>
                equal_range(ForwardIterator first,
                                ForwardIterator last, const T& value);
template<class ForwardIterator, class T, class Compare>
        pair<ForwardIterator, ForwardIterator>
                equal_range(ForwardIterator first,
                        ForwardIterator last, const T& value,
                        Compare comp);
Effects: Finds the largest subrange [i, j) such that the value can be inserted at any iterator k in it. k
        satisfies the corresponding conditions: !(*k < value) && !(value < *k) or comp(*k,
        value) == false && comp( value, *k) == false.
Complexity: At most 2 * log( last - first) + 1 comparisons.
二、最直观的有序数组上实现lower_bound的方法
假设说有一个已经排好序的数组,并且数组中可能有重复元素,在这个数组中获得其中的lower_bound,最直观的方法就是找到数组中第一个元素i,该元素满足 arr[i] >= k并且 arr[i - 1] < k,也就是在数组中,元素i + 1和其前一个元素i - 1之间对k的小于关系是产生一个跃迁,用epoll的说法来说,就是“边缘触发”(edge-triggered)。
下面是实例代码:
tsecer@harry: cat std_equal_range.cpp 
#include <stdio.h>
int arr[] = {1, 2 , 3, 4, 5, 5, 5, 6, 7, 8};
#define ARRLEN int(sizeof(arr)/sizeof(arr[0]))
#define KEY    5
int main()
{
        for (int i  = 0; i < ARRLEN; i++)
        {
                if (arr[i] < KEY && (i + 1 < ARRLEN) && arr[i + 1] >= KEY)
                {
                        printf("index %d prev %d next %d\n", i + 1, arr[i], arr[i + 1]);
                        return 0;
                }
        }
        return -1;
}
tsecer@harry: g++ std_equal_range.cpp 
tsecer@harry: ./a.out 
index 4 prev 4 next 5
tsecer@harry: 
三、提高一点效率
由于数组有序,所以一个最直观的优化是使用二分查找,这样可以满足文档中所说的log级别查找效率。
实现的思路是 每次对于中间节点的值和键值key比较,如果满足 >= key,则记录为一个潜在的可能满足者(candidate),再查找看是否有比这个更小的、并且满足 >= key的键值,如果没有则返回最后一次记录的键值:
tsecer@harry: cat std_equal_range_binary.cpp
#include <stdio.h>
#include <stdlib.h>

int arr[] = {1, 2 , 3, 4, 5, 5, 5, 6, 7, 8};
#define ARRLEN int(sizeof(arr)/sizeof(arr[0]))

int main(int argc, char *argv[])
{
int KEY = atoi(argv[1]);
int icand = -1;
int ilower = 0, iupper = ARRLEN - 1, imid = (ilower + iupper) / 2;
while (ilower <= iupper)
{
if (arr[imid] >= KEY)
{
icand = imid;
iupper  = imid - 1;
}
else
{
ilower = imid + 1;
}
imid = (ilower + iupper) / 2;
}
if (icand >= 0)
{
printf("lower_bound index %d key %d\n", icand, KEY);
}
else
{
printf("no lower_bound\n");
}
        return -1;
}
tsecer@harry: g++ std_equal_range_binary.cpp -o std_equal_range_binary
tsecer@harry: ./std_equal_range_binary 5
lower_bound index 4 key 5
tsecer@harry: ./std_equal_range_binary 10
no lower_bound
tsecer@harry: ./std_equal_range_binary -1
lower_bound index 0 key -1
tsecer@harry: 
四、gcc中stl库对于该功能的实现
stl对于map、multimap的内部实现使用的都是红黑树结构,所以说内部的节点天然就是有一个类似于二分查找的结构,这个二分的节点就是根基点,左子节点是左边的二分中点,右子节点是右边的二分中点,每次二分查找的时候只用遍历左右节点就可以和前面的二分方法类似。下面是它对于map的lower_bound的实现:

  template<typename _Key, typename _Val, typename _KeyOfValue,
           typename _Compare, typename _Alloc>
    typename _Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::iterator
    _Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::
    lower_bound(const _Key& __k)
    {
      _Link_type __x = _M_begin(); // Current node.
      _Link_type __y = _M_end(); // Last node which is not less than __k.

      while (__x != 0)
if (!_M_impl._M_key_compare(_S_key(__x), __k))
 __y = __x, __x = _S_left(__x);
else
 __x = _S_right(__x);

      return iterator(__y);
    }
由于stl::multimap对于模板参数的定义中,只有less比较符,所以    if (arr[imid] >= KEY) 只能通过转换后对等的 !compare(arr[imid] < k) 表示,并且x的初始值_M_begin()也就是二叉树的中点,而left和right分别对应左侧和右侧二分节点的中点。也就是说这方法本质上和前面的数组二分查找是相同的。
upper_bound的计算方法与此类似,只是在记录候选项的时候判断条件是 if (arr[i] > KEY) 或者 if (KEY < arr[i])的时候记录。
五、stl的_Rb_tree内部结构
其主题实现在该模板类中定义,其中包含了一个 _Rb_tree_node_base类型的_M_header对象,这个是一个meta结构,也就是即使整个map中没有元素它也是存在的,在树上有元素的时候,它还可以被map中的元素指向,作为整个map的end()标识,因为map本质上是一个链表,链表并不能像数组一样使用 起始地址 + 长度 来定界一个终点,最好是指向一个具体的内存地址,而这个地址就是这个结构中的_M_header。并且它的_M_parent指向的是整个二叉树的中点,_M_left指向有序链表的开始(begin()使用),而_M_right则指向最后一个合法节点(_Rb_tree_decrement     if (__x->_M_color == _S_red  && __x->_M_parent->_M_parent == __x) __x = __x->_M_right;)
      // Specialization for _Comparison types that are not capable of
      // being base classes / super classes.
      template<typename _Key_compare>
        struct _Rb_tree_impl<_Key_compare, true> : public _Node_allocator 
{
 _Key_compare _M_key_compare;
 _Rb_tree_node_base _M_header;
 size_type _M_node_count; // Keeps track of size of tree.

 _Rb_tree_impl(const _Node_allocator& __a = _Node_allocator(),
const _Key_compare& __comp = _Key_compare())
 : _Node_allocator(__a), _M_key_compare(__comp), _M_header(),
   _M_node_count(0)
 { 
   this->_M_header._M_color = _S_red;
   this->_M_header._M_parent = 0;
   this->_M_header._M_left = &this->_M_header;
   this->_M_header._M_right = &this->_M_header;
 }
}
六、gdb打印一个map的内部结构
关于map的打印命令,该文章的附件中包含了常见stl容器的gdb打印函数,下面是关于map的摘录:
#
# std::map and std::multimap
#

define pmap
if $argc == 0
help pmap
else
set $tree = $arg0
set $i = 0
set $node = $tree->_M_t->_M_impl->_M_header->_M_left
set $end = $tree->_M_t->_M_impl->_M_header
set $tree_size = $tree->_M_t->_M_impl->_M_node_count
if $argc == 1
printf "Map "
whatis $tree
printf "Use pmap <variable_name> <left_element_type> <right_element_type> to see the elements in the map.\n"
end
if $argc == 3
while $i < $tree_size
set $value = (void *)($node + 1)
printf "elem[%u]->left: ", $i
p *($arg1*)$value
set $value = $value + 4
printf "elem[%u]->right: ", $i
p *($arg2*)$value
if $node->_M_right != 0
set $node = $node->_M_right
while $node->_M_left != 0
set $node = $node->_M_left
end
else
set $tmp_node = $node->_M_parent
while $node == $tmp_node->_M_right
set $node = $tmp_node
set $tmp_node = $tmp_node->_M_parent
end
if $node->_M_right != $tmp_node
set $node = $tmp_node
end
end
set $i++
end
end
if $argc == 4
set $idx = $arg3
set $ElementsFound = 0
while $i < $tree_size
set $value = (void *)($node + 1)
if *($arg1*)$value == $idx
printf "elem[%u]->left: ", $i
p *($arg1*)$value
set $value = $value + 4
printf "elem[%u]->right: ", $i
p *($arg2*)$value
set $ElementsFound++
end
if $node->_M_right != 0
set $node = $node->_M_right
while $node->_M_left != 0
set $node = $node->_M_left
end
else
set $tmp_node = $node->_M_parent
while $node == $tmp_node->_M_right
set $node = $tmp_node
set $tmp_node = $tmp_node->_M_parent
end
if $node->_M_right != $tmp_node
set $node = $tmp_node
end
end
set $i++
end
printf "Number of elements found = %u\n", $ElementsFound
end
if $argc == 5
set $idx1 = $arg3
set $idx2 = $arg4
set $ElementsFound = 0
while $i < $tree_size
set $value = (void *)($node + 1)
set $valueLeft = *($arg1*)$value
set $valueRight = *($arg2*)($value + 4)
if $valueLeft == $idx1 && $valueRight == $idx2
printf "elem[%u]->left: ", $i
p $valueLeft
printf "elem[%u]->right: ", $i
p $valueRight
set $ElementsFound++
end
if $node->_M_right != 0
set $node = $node->_M_right
while $node->_M_left != 0
set $node = $node->_M_left
end
else
set $tmp_node = $node->_M_parent
while $node == $tmp_node->_M_right
set $node = $tmp_node
set $tmp_node = $tmp_node->_M_parent
end
if $node->_M_right != $tmp_node
set $node = $tmp_node
end
end
set $i++
end
printf "Number of elements found = %u\n", $ElementsFound
end
printf "Map size = %u\n", $tree_size
end
end

document pmap
Prints std::map<TLeft and TRight> or std::multimap<TLeft and TRight> information. Works for std::multimap as well.
Syntax: pmap <map> <TtypeLeft> <TypeRight> <valLeft> <valRight>: Prints map size, if T defined all elements or just element(s) with val(s)
Examples:
pmap m - prints map size and definition
pmap m int int - prints all elements and map size
pmap m int int 20 - prints the element(s) with left-value = 20 (if any) and map size
pmap m int int 20 200 - prints the element(s) with left-value = 20 and right-value = 200 (if any) and map size
end        

  评论这张
 
阅读(52)| 评论(0)
推荐 转载

历史上的今天

评论

<#--最新日志,群博日志--> <#--推荐日志--> <#--引用记录--> <#--博主推荐--> <#--随机阅读--> <#--首页推荐--> <#--历史上的今天--> <#--被推荐日志--> <#--上一篇,下一篇--> <#-- 热度 --> <#-- 网易新闻广告 --> <#--右边模块结构--> <#--评论模块结构--> <#--引用模块结构--> <#--博主发起的投票-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 

页脚

网易公司版权所有 ©1997-2017