Rcpp Basics in R: Key Points

Published: 2021-11-06 17:22:54  Source: 億速云  Author: 小新  Category: Development

This article gives a detailed walk-through of the basic points of using Rcpp from R, shared here as a practical reference.

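1. Configuration and notes

Dirk Eddelbuettel's book Seamless R and C++ Integration with Rcpp was published in 2013, when Rcpp Attributes was still a new feature, so the workflow it describes for writing and calling Rcpp functions is comparatively cumbersome. Rcpp Attributes (2016) greatly simplifies the process ("provides an even more direct connection between C++ and R"): it keeps inline functions and adds sourceCpp() for loading an external .cpp file. In other words, we can save a C++ function in a .cpp file and then load it from an R script with sourceCpp(), much as we would use source() for R code.

For example, to call a function defined in a file named test.cpp from an R script: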

library(Rcpp)
Sys.setenv("PKG_CXXFLAGS"="-std=c++11")
sourceCpp("test.cpp")

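The second line tells the compiler to use the C++11 standard.

In test.cpp, include the header Rcpp.h and place each function to be exported to R directly after //[[Rcpp::export]]. If an exported function calls other C++ helper functions, define those helpers before its //[[Rcpp::export]] line.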

#include <Rcpp.h>
using namespace Rcpp;
//[[Rcpp::export]]

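For linear algebra, the Rcpp ecosystem provides RcppArmadillo and RcppEigen. To use one of these packages, declare the dependency at the top of the source file, for example // [[Rcpp::depends(RcppArmadillo)]], and include the corresponding header: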

// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
#include <Rcpp.h>
using namespace Rcpp;
using namespace arma;
// [[Rcpp::export]]

For the basics of C++, see the linked reference.

2. Common data types

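Keyword	Description
int/double/bool/String/auto	integer / floating-point / Boolean / string / automatic type deduction (C++11)
IntegerVector	integer vector
NumericVector	numeric vector (elements are of type double)
ComplexVector	complex vector
LogicalVector	logical vector. An R logical can take three values (TRUE, FALSE, NA), whereas a C++ bool has only two (true, false). Converting an R NA to a C++ bool yields true.
CharacterVector	character vector
ExpressionVector	vector of expression types
RawVector	vector of type raw
IntegerMatrix	integer matrix
NumericMatrix	numeric matrix (elements are of type double)
LogicalMatrix	logical matrix
CharacterMatrix	character matrix
List (aka GenericVector)	list; like an R list, its elements can be of any type
DataFrame	data frame; internally, Rcpp implements a data frame as a list
Function	R function
Environment	R environment; can be used to reference functions in the R environment, functions from other R packages, and to manipulate variables in the R environment
RObject	generic type for any object R recognizes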
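
The LogicalVector row says that an R NA turns into true when it is converted to a C++ bool. The following plain C++ sketch shows one plausible reason, assuming (as in R's C API) that logical values are stored as ints and that the logical NA is the sentinel INT_MIN. The names kNaLogical, toBool and isNa are illustrative only and are not part of Rcpp.

```cpp
#include <climits>

// Assumption: mirrors R's internal encoding, where a logical NA is the
// integer INT_MIN. kNaLogical is a stand-in name, not an Rcpp identifier.
const int kNaLogical = INT_MIN;

// Plain int-to-bool conversion: every nonzero value, including the NA
// sentinel, becomes true.
bool toBool(int rLogical) {
    return rLogical != 0;
}

// Three-valued check that keeps NA distinct from TRUE and FALSE.
bool isNa(int rLogical) {
    return rLogical == kNaLogical;
}
```

In practice this suggests testing for NA explicitly before collapsing an R logical into a bool.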

Notes:

Some R objects can be converted to Rcpp objects with as<Some_RcppObject>(Some_RObject). For example:
fit a linear model in R (the result is a list) and pass it to a C++ function

>mod=lm(Y~X);

NumericVector resid = as<NumericVector>(mod["residuals"]);
NumericVector fitted = as<NumericVector>(mod["fitted.values"]);

An Rcpp vector can be converted to an STL container with as<some_STL_vector>(Some_RcppVector), for example a NumericVector to a std::vector. For example:

std::vector<double> vec;
vec = as<std::vector<double>>(x);

Inside a function, wrap() can be used to convert a std::vector to a NumericVector. For example:

arma::vec long_vec(16,arma::fill::randn);
vector<double> long_vec2 = conv_to<vector<double>>::from(long_vec);
NumericVector output = wrap(long_vec2);

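When returning from a function, wrap() can be used to convert C++ STL types into types that R recognizes. See the input and output examples section later on.

With the exception of Environment (and possibly Function), most of the data types above can be used directly as a function's return type and are converted to R objects automatically.

Arithmetic and comparison operators: +, -, *, /, ++, --, pow(x,p), <, <=, >, >=, ==, !=. Logical operators: &&, ||, !.

3. Creating common data types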

//1. Vector
NumericVector V1(n); // a numeric vector of length n, default-initialized (filled with zeros)
NumericVector V2=NumericVector::create(1, 2, 3); // a numeric vector initialized with the three values 1, 2, 3
LogicalVector V3=LogicalVector::create(true,false,NA_LOGICAL); // a logical vector; converted to an R object it holds TRUE, FALSE, NA
//2. Matrix
NumericMatrix M1(nrow,ncol); // an nrow x ncol numeric matrix, default-initialized
//3. Multidimensional Array
NumericVector out=NumericVector(Dimension(2,2,3)); // a 2 x 2 x 3 multidimensional array
//4. List
NumericMatrix y1(2,2);
NumericVector y2(5);
List L=List::create(Named("y1")=y1,
                    Named("y2")=y2);

//5. DataFrame
NumericVector a=NumericVector::create(1,2,3);
CharacterVector b=CharacterVector::create("a","b","c");
std::vector<std::string> c(3);
c[0]="A";c[1]="B";c[2]="C";
DataFrame DF=DataFrame::create(Named("col1")=a,
                               Named("col2")=b,
                               Named("col3")=c);

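4. Element access for common data types

Accessor	Description
[n]	For a vector or list, accesses the n-th element. For a matrix, the columns are first stacked one below another into a single long column vector, and the n-th element of that vector is accessed. Unlike R, n starts at 0.
(i,j)	For a matrix, accesses element (i,j). Unlike R, i and j start at 0. Unlike vector indexing, parentheses are used here.
List["name1"]/DataFrame["name2"]	Accesses the element named name1 in a List / the column named name2 in a DataFrame.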
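
The relationship between [n] and (i,j) on a matrix can be sketched in plain C++. This is a minimal illustration that assumes the column-stacked (column-major) layout described in the table; linearIndex and at are hypothetical helper names, not Rcpp functions, and a std::vector stands in for the matrix storage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: 0-based offset of element (i, j) when the columns of
// a matrix with nrow rows are stacked into one long vector.
std::size_t linearIndex(std::size_t i, std::size_t j, std::size_t nrow) {
    assert(i < nrow);  // row index must lie inside a column
    return i + j * nrow;
}

// Reads element (i, j) from column-stacked storage.
double at(const std::vector<double>& data, std::size_t nrow,
          std::size_t i, std::size_t j) {
    return data[linearIndex(i, j, nrow)];
}
```

For a 2 x 3 matrix stored as {1, 2, 3, 4, 5, 6}, the columns are (1,2), (3,4) and (5,6), so element (0,1) is found at offset 2.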

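5. Member functions

Member function	Description
X.size()	Returns the length of X; applies to vectors and matrices (a matrix is treated as its vectorized form)
X.push_back(a)	Appends a to the end of X; applies to vectors
X.push_front(b)	Prepends b to the front of X; applies to vectors
X.ncol()	Returns the number of columns of X
X.nrow()	Returns the number of rows of X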

6. Syntactic sugar

6.1 Arithmetic and logical operators

+, -, *, /, pow(x,p), <, <=, >, >=, ==, !=, !

All of the above operators are vectorized.

6.2. Common functions

is_na()
Produces a logical sugar expression of the same length. Each element of the result expression evaluates to TRUE if the corresponding input is a missing value, or FALSE otherwise.

seq_len()
seq_len( 10 ) will generate an integer vector from 1 to 10 (Note: not from 0 to 9), which is very useful in conjunction with sapply() and lapply().

pmin(a,b) and pmax(a,b)
a and b are two vectors. pmin() (or pmax()) compares the i-th elements of a and b and returns the smaller (larger) one.

ifelse()
ifelse( x > y, x+y, x-y ) works element by element: where x > y is true the result is x+y, otherwise x-y.
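
To make the element-wise behaviour of pmin() and ifelse() concrete, here is a plain C++ sketch of what they compute for equal-length inputs. It is not how Rcpp sugar is implemented (sugar uses lazy expressions and handles NA), and the names pminSketch and ifelseSketch are made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Element-wise minimum of two equal-length vectors, like pmin(a, b).
std::vector<double> pminSketch(const std::vector<double>& a,
                               const std::vector<double>& b) {
    assert(a.size() == b.size());
    std::vector<double> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = std::min(a[i], b[i]);
    }
    return out;
}

// Element-wise choice, like ifelse(x > y, x + y, x - y).
std::vector<double> ifelseSketch(const std::vector<double>& x,
                                 const std::vector<double>& y) {
    assert(x.size() == y.size());
    std::vector<double> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = (x[i] > y[i]) ? x[i] + y[i] : x[i] - y[i];
    }
    return out;
}
```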

sapply()
sapply applies a C++ function to each element of the given expression to create a new expression. The type of the resulting expression is deduced by the compiler from the result type of the function.

The function can be a free C++ function such as the overload generated by the template function below:

template <typename T>
T square( const T& x){
    return x * x ;
}
sapply( seq_len(10), square<int> ) ;

Alternatively, the function can be a functor whose type has a nested type called result_type:

template <typename T>
struct square : std::unary_function<T,T> {  // note: std::unary_function was removed in C++17
    T operator()(const T& x){
        return x * x ;
    }
};
sapply( seq_len(10), square<int>() ) ;

lapply()
lapply is similar to sapply except that the result is always a list expression (an expression of type VECSXP).

sign()

Other functions

Math functions: abs(), acos(), asin(), atan(), beta(), ceil(), ceiling(), choose(), cos(), cosh(), digamma(), exp(), expm1(), factorial(), floor(), gamma(), lbeta(), lchoose(), lfactorial(), lgamma(), log(), log10(), log1p(), pentagamma(), psigamma(), round(), signif(), sin(), sinh(), sqrt(), tan(), tanh(), tetragamma(), trigamma(), trunc().
Summary functions: mean(), min(), max(), sum(), sd(), and (for vectors) var()
Summary functions that return a vector: cumsum(), diff(), pmin(), and pmax()
Search functions: match(), self_match(), which_max(), which_min()
Duplicate-handling functions: duplicated(), unique()

7. STL

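Rcpp can use the data structures and algorithms of the C++ Standard Template Library (STL), as well as those provided by Boost.

7.1. Iterators

Only a single example is given here; for details, see C++ Primer or the linked reference.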

#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double sum3(NumericVector x) {
  double total = 0;
  NumericVector::iterator it;
  for(it = x.begin(); it != x.end(); ++it) {
    total += *it;
  }
  return total;
}

7.2. Algorithms

The <algorithm> header provides many algorithms that work together with iterators; see the linked reference for details.

For example, we could write a basic Rcpp version of findInterval() that takes two arguments, a vector of values and a vector of breaks, and locates the bin that each x falls into.

#include <algorithm>
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector findInterval2(NumericVector x, NumericVector breaks) {
  IntegerVector out(x.size());
  NumericVector::iterator it, pos;
  IntegerVector::iterator out_it;
  for(it = x.begin(), out_it = out.begin(); it != x.end(); 
      ++it, ++out_it) {
    pos = std::upper_bound(breaks.begin(), breaks.end(), *it);
    *out_it = std::distance(breaks.begin(), pos);
  }
  return out;
}

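7.3. Data structures

The data structures provided by the STL can be used as well. Rcpp knows how to convert STL data structures into R data structures, so they can be returned from a function directly without manual conversion.
See the linked reference for details.

7.3.1. Vectors

See the linked reference for details.

Creation
vector<int>, vector<bool>, vector<double>, vector<String>

Element access
Use the standard [] notation to access elements.

Adding elements
Use .push_back() to append elements.

Storage allocation
If the length of the vector is known in advance, use .reserve() to allocate enough storage up front.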
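
As a small illustration of .reserve() combined with .push_back(), the sketch below builds a vector whose final length is known in advance. The function name squares is hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds {0, 1, 4, 9, ...} with n elements.
std::vector<int> squares(std::size_t n) {
    std::vector<int> out;
    out.reserve(n);               // allocates capacity only; size() is still 0
    assert(out.capacity() >= n);  // guaranteed by reserve()
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<int>(i * i));  // no reallocation needed
    }
    return out;
}
```

Note that reserve() changes only the capacity, so elements still have to be added with push_back(); indexing into the reserved but unfilled range would be an error.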

Example:

The following code implements run length encoding (rle()). It produces two vectors of output: a vector of values, and a vector lengths giving how many times each element is repeated. It works by looping through the input vector x comparing each value to the previous: if it's the same, then it increments the last value in lengths; if it's different, it adds the value to the end of values, and sets the corresponding length to 1.

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
List rleC(NumericVector x) {
  std::vector<int> lengths;
  std::vector<double> values;

  // Initialise first value
  int i = 0;
  double prev = x[0];
  values.push_back(prev);
  lengths.push_back(1);

  NumericVector::iterator it;
  for(it = x.begin() + 1; it != x.end(); ++it) {
    if (prev == *it) {
      lengths[i]++;
    } else {
      values.push_back(*it);
      lengths.push_back(1);

      i++;
      prev = *it;
    }
  }
  return List::create(
    _["lengths"] = lengths, 
    _["values"] = values
  );
}

7.3.2. Sets

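See link 1, link 2 and link 3.

The STL set std::set does not allow duplicate elements, while std::multiset does. Sets are useful for detecting duplicates and for finding the distinct elements (as in unique, duplicated, or in).

Ordered sets: std::set and std::multiset.

Unordered set: std::unordered_set
Unordered sets are generally faster because they use a hash table rather than a tree.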
unordered_set<int>, unordered_set<bool>, etc
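
The connection to unique and duplicated can be sketched with std::unordered_set in plain C++. This is an illustrative sketch, not the implementation used by R or by Rcpp sugar; duplicatedSketch and uniqueSketch are hypothetical names, and std::vector<int> stands in for an IntegerVector.

```cpp
#include <unordered_set>
#include <vector>

// Like R's duplicated(): true for every element that already appeared
// earlier in x.
std::vector<bool> duplicatedSketch(const std::vector<int>& x) {
    std::unordered_set<int> seen;
    std::vector<bool> out;
    out.reserve(x.size());
    for (int v : x) {
        // insert() returns a pair; .second is false when v was already present.
        bool inserted = seen.insert(v).second;
        out.push_back(!inserted);
    }
    return out;
}

// Like R's unique(): the first occurrence of each value, in original order.
std::vector<int> uniqueSketch(const std::vector<int>& x) {
    std::unordered_set<int> seen;
    std::vector<int> out;
    for (int v : x) {
        if (seen.insert(v).second) {
            out.push_back(v);
        }
    }
    return out;
}
```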

7.3.3. Maps

Closely related to table() and match().

Ordered map: std::map

Unordered map: std::unordered_map

Since maps have a value and a key, you need to specify both types when initialising a map:

map<double, int>, unordered_map<int, double>.
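
As a sketch of how a map relates to table(), the following plain C++ function counts how often each value occurs. The name tableSketch is hypothetical, and std::vector<double> stands in for a NumericVector.

```cpp
#include <map>
#include <vector>

// Counts occurrences of each distinct value, similar in spirit to R's table().
// std::map keeps its keys sorted, so iteration visits values in ascending order.
std::map<double, int> tableSketch(const std::vector<double>& x) {
    std::map<double, int> counts;
    for (double v : x) {
        ++counts[v];  // operator[] inserts a zero count for a new key
    }
    return counts;
}
```

Swapping std::map for std::unordered_map would drop the ordering guarantee in exchange for hash-based lookup.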

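8. Interacting with the R environment

Through Environment, Rcpp can read the variables and loaded functions in the current R global environment and can modify variables there. Environment can also be used to obtain functions from other R packages and call them from Rcpp.

Getting a function from another R package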

Rcpp::Environment stats("package:stats");
Rcpp::Function rnorm = stats["rnorm"];
return rnorm(10, Rcpp::Named("sd", 100.0));

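Getting and modifying a variable in the R global environment
Suppose the R global environment contains a vector x=c(1,2,3) whose value we want to change from Rcpp.

Rcpp::Environment global = Rcpp::Environment::global_env(); // get the global environment and bind it to the Environment variable global
Rcpp::NumericVector tmp = global["x"]; // get x
tmp=pow(tmp,2); // square each element
global["x"]=tmp; // assign the new value back to x in the global environment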

Calling a function loaded in the R global environment
Suppose the global environment contains an R function funR, defined as:

x=c(1,2,3);
funR<-function(x){
  return (-x);
}

together with the R variable x=c(1,2,3). We want to call this function from Rcpp and apply it to the vector x.

#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector funC() {
  Rcpp::Environment global =
    Rcpp::Environment::global_env();
  Rcpp::Function funRinC = global["funR"];
  Rcpp::NumericVector tmp = global["x"];
  return funRinC(tmp);
}

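9. Creating an R package with Rcpp

See this article:

Creating an R package with Rcpp and RcppArmadillo

10. Input and output examples

How to pass arrays

To pass a higher-dimensional array, store it as a vector and attach the dimension information. There are two ways to do this:

Setting the dimensions via .attr("dim")

A NumericVector can carry dimension information, so an array can be returned to R as a NumericVector whose dimensions are set via .attr("dim").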

// The Dimension constructor takes at most three extents
output.attr("dim") = Dimension(3,4,2);
// Assigning a vector to .attr("dim") allows more than three dimensions
NumericVector dim = NumericVector::create(2,2,2,2);
output.attr("dim") = dim;

Example:

// returns a 3*3*2 array
RObject func(){
  arma::vec long_vec(18,arma::fill::randn);
  vector<double> long_vec2 = conv_to<vector<double>>::from(long_vec);
  NumericVector output = wrap(long_vec2);
  output.attr("dim")=Dimension(3,3,2);
  return wrap(output);
}

// returns a 2*2*2*2 array
// note the use of conv_to<>::from()
RObject func(){
  arma::vec long_vec(16,arma::fill::randn);
  vector<double> long_vec2 = conv_to<vector<double>>::from(long_vec);
  NumericVector output = wrap(long_vec2);
  NumericVector dim = NumericVector::create(2,2,2,2);
  output.attr("dim")=dim;
  return wrap(output);
}
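
Because the array travels as a flat vector plus a dim attribute, it can help to see where element (i, j, k, ...) ends up in that vector. The sketch below assumes R's column-major layout, in which the first index varies fastest; flatOffset is a hypothetical helper, not an Rcpp or Armadillo function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// 0-based offset of a multi-index inside a flat vector whose "dim" attribute
// is dims, assuming the first index varies fastest (column-major order).
std::size_t flatOffset(const std::vector<std::size_t>& dims,
                       const std::vector<std::size_t>& idx) {
    assert(dims.size() == idx.size());
    std::size_t offset = 0;
    std::size_t stride = 1;  // number of elements spanned by one step in dimension d
    for (std::size_t d = 0; d < dims.size(); ++d) {
        assert(idx[d] < dims[d]);
        offset += idx[d] * stride;
        stride *= dims[d];
    }
    return offset;
}
```

For dims (3,4,2), the last element (2,3,1) sits at offset 2 + 3*3 + 1*12 = 23, the final position of the 24-element vector.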

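Returning the dimensions in a separate vector and setting the dim attribute afterwards in R

Returning a one-dimensional STL vector from a function

Automatically converted to an R vector.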

vector<double> func(NumericVector x){
  vector<double> vec;
  vec = as<vector<double>>(x);
  return vec;
}
NumericVector func(NumericVector x){
  vector<double> vec;
  vec = as<vector<double>>(x);
  return wrap(vec);
}
RObject func(NumericVector x){
  vector<double> vec;
  vec = as<vector<double>>(x);
  return wrap(vec);
}

Returning a two-dimensional STL vector from a function

Automatically converted to an R list in which each element is a vector.

vector<vector<double>> func(NumericVector x) {
  vector<vector<double>> mat;
  for (int i=0;i!=3;++i){
    mat.push_back(as<vector<double>>(x));
  }
  return mat;
}
RObject func(NumericVector x) {
  vector<vector<double>> mat;
  for (int i=0;i!=3;++i){
    mat.push_back(as<vector<double> >(x));
  }
  return wrap(mat);
}

Returning an Armadillo matrix, cube, or field

Automatically converted to an R matrix.

NumericMatrix func(){
  arma::mat A(3,4,arma::fill::randu);
  return wrap(A);
}
arma::mat func(){
  arma::mat A(3,4,arma::fill::randu);
  return A;
}

Automatically converted to a three-dimensional R array.

arma::cube func(){
  arma::cube A(3,4,5,arma::fill::randu);
  return A;
}
RObject func(){
  arma::cube A(3,4,5,arma::fill::randu);
  return wrap(A);
}

Automatically converted to an R list in which each element is an R vector that carries dimension information (this can be checked with .Internal(inspect())).

RObject func() {
  arma::cube A(3,4,2,arma::fill::randu);
  arma::cube B(3,4,2,arma::fill::randu);
  arma::field <arma::cube> F(2,1);
  F(0)=A;
  F(1)=B;
  return wrap(F);
}

