当前位置：首页 > 技术 > 正文内容

使用C++标准库实现高效数据处理

访客技术 2026年7月3日 1

1、非修改型遍历操作

此类算法仅读取数据，不改变原容器内容。

1.1 定位查找函数

find(first, last, value)：在指定范围内搜索首个与目标值匹配的元素。
find_if(first, last, predicate)：定位第一个满足条件的元素。
find_end(first, last, sub_first, sub_last)：寻找子序列最后一次出现的位置。

std::vector<int> data = {10, 25, 30, 45, 60};

// 查找数值30
auto pos = std::find(data.begin(), data.end(), 30);
if (pos != data.end()) {
    std::cout << "Found: " << *pos << '\n'; // 输出：30
}

// 找到首个大于40的元素
auto next = std::find_if(data.begin(), data.end(), [](int x) {
    return x > 40;
});
std::cout << "First >40: " << *next << '\n'; // 输出：45

// 检查子序列[25,30]是否存在
std::vector<int> pattern = {25, 30};
auto match = std::find_end(data.begin(), data.end(), pattern.begin(), pattern.end());
if (match != data.end()) {
    std::cout << "Pattern starts at index: " << (match - data.begin()) << '\n'; // 输出：1
}

1.2 统计与计数

count(first, last, value)：统计特定值出现次数。
count_if(first, last, predicate)：计算满足条件的元素数量。

std::vector<int> numbers = {2, 4, 6, 4, 8, 4};
int count_4 = std::count(numbers.begin(), numbers.end(), 4); // 结果为3
int even_count = std::count_if(numbers.begin(), numbers.end(), [](int x) {
    return x % 2 == 0;
}); // 结果为6（全部为偶数）

1.3 遍历执行

对每个元素应用指定操作

std::vector<int> values = {1, 2, 3, 4, 5};
std::for_each(values.begin(), values.end(), [](int& item) {
    item *= 3; // 每个元素乘以3
});
// 更新后：{3, 6, 9, 12, 15}

1.4 等价性检测

equal(first1, last1, first2)：判断两个范围是否完全一致。
mismatch(first1, last1, first2)：返回第一个差异位置的迭代器对。

std::vector<int> a = {1, 2, 3};
std::vector<int> b = {1, 2, 5};
bool is_same = std::equal(a.begin(), a.end(), b.begin()); // false

auto diff = std::mismatch(a.begin(), a.end(), b.begin());
if (diff.first != a.end()) {
    std::cout << "Mismatch at: " << *diff.first << " vs " << *diff.second << '\n';
}

1.5 条件验证

检查范围中所有、任意或无元素满足条件

std::vector<int> nums = {10, 20, 30, 40};
bool all_positive = std::all_of(nums.begin(), nums.end(), [](int x) {
    return x > 0;
}); // true

bool has_odd = std::any_of(nums.begin(), nums.end(), [](int x) {
    return x % 2 == 1;
}); // false

bool no_negative = std::none_of(nums.begin(), nums.end(), [](int x) {
    return x < 0;
}); // true

2、可修改型数据变换

这些操作会直接更改输入容器的内容。

2.1 数据复制与筛选

copy(first, last, dest)：将源范围复制到目标起始位置。
copy_if(first, last, dest, predicate)：仅复制符合条件的元素。

std::vector<int> source = {1, 2, 3, 4, 5};
std::vector<int> target(5);

// 复制全部元素
std::copy(source.begin(), source.end(), target.begin());

// 只复制奇数
std::vector<int> odds;
std::copy_if(source.begin(), source.end(), std::back_inserter(odds), [](int x) {
    return x % 2 == 1;
});
// odds: {1, 3, 5}

2.2 元素转换

对每个元素执行映射操作，并输出结果

std::vector<int> input = {1, 2, 3};
std::vector<int> squares(3);

// 单参数转换：平方运算
std::transform(input.begin(), input.end(), squares.begin(), [](int x) {
    return x * x;
});

// 双参数转换：两向量对应项相加
std::vector<int> a = {1, 2, 3}, b = {4, 5, 6};
std::vector<int> sum(3);
std::transform(a.begin(), a.end(), b.begin(), sum.begin(), [](int x, int y) {
    return x + y;
});

2.3 替换操作

replace(first, last, old_val, new_val)：替换所有旧值。
replace_if(first, last, predicate, new_val)：按条件替换。
replace_copy(first, last, dest, old_val, new_val)：复制并替换，保留原数据。

std::vector<int> list = {7, 8, 9, 8, 10};

// 将所有8替换为88
std::replace(list.begin(), list.end(), 8, 88);

// 将大于9的值设为0
std::replace_if(list.begin(), list.end(), [](int x) {
    return x > 9;
}, 0);

// 创建副本，替换9为900
std::vector<int> result;
std::replace_copy(list.begin(), list.end(), std::back_inserter(result), 9, 900);

2.4 删除逻辑与物理清除

remove(first, last, value)：将匹配元素移至末尾，返回新结尾。
remove_if(first, last, predicate)：基于谓词删除。
erase 必须配合使用以实际释放内存。

std::vector<int> items = {1, 2, 3, 2, 4};

// 移除所有2（逻辑删除）
auto new_end = std::remove(items.begin(), items.end(), 2);
items.erase(new_end, items.end()); // 物理清理

// 使用lambda移除偶数
items = {1, 2, 3, 4, 5};
items.erase(std::remove_if(items.begin(), items.end(), [](int x) {
    return x % 2 == 0;
}), items.end());

2.5 去重

移除连续重复项，通常需配合 erase

std::vector<int> data = {1, 1, 2, 2, 2, 3, 4, 4};
auto last = std::unique(data.begin(), data.end());
data.erase(last, data.end()); // 去重后：{1, 2, 3, 4}

2.6 反转与旋转

reverse(first, last)：反转顺序。
rotate(first, middle, last)：以中间点为轴旋转。

std::vector<int> nums = {1, 2, 3, 4, 5};
std::reverse(nums.begin(), nums.end()); // {5, 4, 3, 2, 1}

std::rotate(nums.begin(), nums.begin() + 2, nums.end()); // 以第3个元素开始旋转 → {3, 4, 5, 1, 2}

2.7 随机打乱

使用随机引擎对元素进行随机排列

#include <random>
std::vector<int> data = {1, 2, 3, 4, 5};
std::random_device rd;
std::mt19937 gen(rd());
std::shuffle(data.begin(), data.end(), gen);

3、排序与查找优化

3.1 排序策略

sort(first, last)：快速排序，不稳定。
stable_sort(first, last)：归并排序，保持相等元素顺序。
partial_sort(first, mid, last)：仅排序前部最小元素。

std::vector<int> arr = {5, 2, 8, 1, 9};
std::sort(arr.begin(), arr.end()); // 升序：{1, 2, 5, 8, 9}
std::sort(arr.begin(), arr.end(), std::greater<>()); // 降序

// 仅获取最小的前3个
std::partial_sort(arr.begin(), arr.begin() + 3, arr.end());
// 前三个为最小的三个元素，已排序

3.2 第k小元素

使第k个位置的元素处于正确排序位置

std::vector<int> vec = {5, 3, 1, 4, 2};
std::nth_element(vec.begin(), vec.begin() + 2, vec.end());
// vec[2] 是第三小的元素（值为3），左侧均≤3，右侧≥3

3.3 二分查找系列

必须作用于已排序容器

binary_search(first, last, value)：判断存在性。
lower_bound(first, last, value)：首个不小于值的位置。
upper_bound(first, last, value)：首个大于值的位置。

std::vector<int> sorted = {1, 3, 3, 5, 7};

bool found = std::binary_search(sorted.begin(), sorted.end(), 3); // true

auto low = std::lower_bound(sorted.begin(), sorted.end(), 3); // 指向第一个3
auto up = std::upper_bound(sorted.begin(), sorted.end(), 3);   // 指向第一个5

3.4 合并有序序列

合并两个已排序范围

std::vector<int> left = {1, 3, 5};
std::vector<int> right = {2, 4, 6};
std::vector<int> merged(left.size() + right.size());

std::merge(left.begin(), left.end(), right.begin(), right.end(), merged.begin());
// merged: {1, 2, 3, 4, 5, 6}

4、堆结构管理

支持最大堆操作，适用于优先队列实现

std::vector<int> heap_data = {4, 1, 3, 2, 5};
std::make_heap(heap_data.begin(), heap_data.end()); // 构建最大堆

heap_data.push_back(6);
std::push_heap(heap_data.begin(), heap_data.end()); // 插入新元素

std::pop_heap(heap_data.begin(), heap_data.end()); // 最大值移到末尾
int max = heap_data.back(); // 取出最大值
heap_data.pop_back();

std::sort_heap(heap_data.begin(), heap_data.end()); // 排序为升序

5、极值提取

5.1 单值比较

min(a, b) / max(a, b)：返回两者中的较小/较大值。
min({...}) / max({...})：支持初始化列表。

int min_val = std::min(5, 3); // 3
auto smallest = std::min({4, 1, 8, 2}); // 1

5.2 迭代器级极值

min_element(first, last)：返回最小值的迭代器。
max_element(first, last)：返回最大值的迭代器。

std::vector<int> data = {3, 1, 4, 2, 5};
auto min_it = std::min_element(data.begin(), data.end());
auto max_it = std::max_element(data.begin(), data.end());

5.3 同时获取最值

minmax_element 返回最小和最大值的迭代器对

auto range = std::minmax_element(data.begin(), data.end());
// range.first 指向1，range.second 指向5

6、数值运算工具

位于 <numeric> 头文件中。

6.1 累加求和

accumulate(first, last, init)：累加所有元素。
支持自定义操作符。

std::vector<int> nums = {1, 2, 3, 4, 5};
int total = std::accumulate(nums.begin(), nums.end(), 0); // 15
int product = std::accumulate(nums.begin(), nums.end(), 1, std::multiplies<int>()); // 120

6.2 内积计算

计算两个序列对应项乘积之和

std::vector<int> a = {1, 2, 3}, b = {4, 5, 6};
int inner = std::inner_product(a.begin(), a.end(), b.begin(), 0); // 1×4 + 2×5 + 3×6 = 32

6.3 连续赋值

用递增数值填充范围

std::vector<int> buffer(5);
std::iota(buffer.begin(), buffer.end(), 10); // {10, 11, 12, 13, 14}

6.4 累积部分和

生成前缀和数组

std::vector<int> src = {1, 2, 3, 4, 5};
std::vector<int> prefix(src.size());
std::partial_sum(src.begin(), src.end(), prefix.begin());
// prefix: {1, 3, 6, 10, 15}

6.5 相邻差分

计算相邻元素之间的差值

std::vector<int> src = {1, 3, 6, 10, 15};
std::vector<int> diffs(src.size());
std::adjacent_difference(src.begin(), src.end(), diffs.begin());
// diffs: {1, 2, 3, 4, 5}

7、生成与集合操作

7.1 动态填充

generate(first, last, generator)：用函数生成值填充范围。

std::vector<int> buffer(5);
int counter = 0;
std::generate(buffer.begin(), buffer.end(), [&]() {
    return ++counter;
});
// buffer: {1, 2, 3, 4, 5}

7.2 限定生成

generate_n(first, count, generator)：仅生成前n个元素。

std::vector<int> buf(5);
int start = 100;
std::generate_n(buf.begin(), 3, [&]() {
    return start++;
});
// 前三个为100, 101, 102，其余不变

7.3 子集包含判断

检查一个已排序序列是否包含另一个

std::vector<int> full = {1, 2, 3, 4, 5};
std::vector<int> subset = {2, 4};
bool contains = std::includes(full.begin(), full.end(), subset.begin(), subset.end()); // true

7.4 集合运算

set_union：并集。
set_intersection：交集。
set_difference：差集。
set_symmetric_difference：对称差集。

std::vector<int> v1 = {1, 2, 3, 4, 5};
std::vector<int> v2 = {3, 4, 5, 6, 7};
std::vector<int> result;

std::set_union(v1.begin(), v1.end(), v2.begin(), v2.end(), std::back_inserter(result));
// result: {1, 2, 3, 4, 5, 6, 7}

result.clear();
std::set_intersection(v1.begin(), v1.end(), v2.begin(), v2.end(), std::back_inserter(result));
// result: {3, 4, 5}

8、关键概念说明

sort 与 stable_sort 的区别？

sort 采用 introsort，效率高但不保证相等元素顺序。
stable_sort 使用归并策略，保持原有顺序，空间开销略大。

为何 remove 需搭配 erase？ remove 仅将待删元素移动至末尾，返回新的有效尾指针，但不改变容器大小。必须调用 erase 删掉冗余尾部元素，才能真正释放空间。
哪些算法要求输入有序？ 二分查找类（binary_search, lower_bound）、集合运算类（set_union 等）、merge 等依赖有序性来实现 O(log n) 或线性时间复杂度。

标签: C++STL 算法

返回列表

上一篇：基于字符模式的图形生成与矩阵变换技术

下一篇：119K Star微软开源神器，轻松搞定Markdown文档转换

Linux crontab 详解

1) crontab 是什么cron 是 Linux 的定时任务守护进程；crontab 是用来编辑/查看“按时间周期执行命令”的表（cron table）。常见两类：用户 crontab：每个用户一份（crontab -e 编辑）系统级 crontab / cron.d：可指定执行用户（/etc/crontab、/etc/cron.d/*）2) crontab 时间...

富文本里可以允许的 HTML 属性

一、所有标签默认允许的安全属性（极少）class （可选）id （通常建议禁用）title️ 注意：id 容易被滥用做锚点注入，很多系统直接禁用class 允许的话最好只允许固定前缀（如 editor-*）二、a 标签允许属性<a href="" t...

方法一：通过官网安装包（最简单，适合初学者）如果你只是想快速安装并开始使用，这是最直接的方法。访问 Node.js 官网。页面会显示两个版本：LTS (Recommended For Most Users)：长期支持版，最稳定。建议选这个。Current：最新特性版，包含最新功能但可能不够稳定。下载 .pkg 安装包并运行。按照安装向导点击“下一步”即可完成。方法二：使用 Homebrew 安装（...

Dom\HTML_NO_DEFAULT_NS 的副作用：自动加闭合标签

在使用Dom\HTMLDocument时，Dom\HTML_NO_DEFAULT_NS 将禁止在解析过程中设置元素的命名空间, 此设置是为了与DOMDocument向后兼容而存在的。当使用它时，已知的一个副作用就是：自动加闭合标签例如 </img> 为什么会这样？当你使用：Dom\HTML_NO_DEFAULT_NS文档会变成无命名空间模式，此时内部更接近 XML...

Laravel 事件和监听器创建

在 Laravel 中，使用 Artisan 命令创建 Events（事件）和 Listeners（监听器）是非常高效的。你可以通过以下几种方式来实现：1. 手动创建单个 Event如果你只想创建一个事件类，可以使用 make:event 命令：Bashphp artisan make:event UserRegistered执行后，文件将生成在 app/Even...

自定义域名解析神器 dnsmasq

什么是 dnsmasq？dnsmasq 是一个轻量级、功能强大的网络服务工具，专为小型和中等规模网络设计。它是一个综合的网络基础设施解决方案[1]。dnsmasq 能做什么？功能说明应用场景DNS 转发与缓存将 DNS 查询转发到上游服务器（ISP、Google DNS 等），并在本地缓存结果加快 DNS 查询速度，减少外部 DNS 流量本地 DNS解析本地网络设备的主机名，无需编辑&n...

老程序员博客