This article explains how PostgreSQL's tuplesort_performsort function works. It first covers the data structures involved, then the source code, and finally a gdb trace of a sort.
TupleTableSlot
The executor stores tuples in a "tuple table", which is a List of independent TupleTableSlots.
/*----------
 * The executor stores tuples in a "tuple table" which is a List of
 * independent TupleTableSlots.  There are several cases we need to handle:
 *      1. physical tuple in a disk buffer page
 *      2. physical tuple constructed in palloc'ed memory
 *      3. "minimal" physical tuple constructed in palloc'ed memory
 *      4. "virtual" tuple consisting of Datum/isnull arrays
 *
 * The first two cases are similar in that they both deal with "materialized"
 * tuples, but resource management is different.  For a tuple in a disk page
 * we need to hold a pin on the buffer until the TupleTableSlot's reference
 * to the tuple is dropped; while for a palloc'd tuple we usually want the
 * tuple pfree'd when the TupleTableSlot's reference is dropped.
 *
 * A "minimal" tuple is handled similarly to a palloc'd regular tuple.
 * At present, minimal tuples never are stored in buffers, so there is no
 * parallel to case 1.  Note that a minimal tuple has no "system columns".
 * (Actually, it could have an OID, but we have no need to access the OID.)
 *
 * A "virtual" tuple is an optimization used to minimize physical data
 * copying in a nest of plan nodes.  Any pass-by-reference Datums in the
 * tuple point to storage that is not directly associated with the
 * TupleTableSlot; generally they will point to part of a tuple stored in
 * a lower plan node's output TupleTableSlot, or to a function result
 * constructed in a plan node's per-tuple econtext.  It is the responsibility
 * of the generating plan node to be sure these resources are not released
 * for as long as the virtual tuple needs to be valid.  We only use virtual
 * tuples in the result slots of plan nodes --- tuples to be copied anywhere
 * else need to be "materialized" into physical tuples.  Note also that a
 * virtual tuple does not have any "system columns".
 *
 * It is also possible for a TupleTableSlot to hold both physical and minimal
 * copies of a tuple.  This is done when the slot is requested to provide
 * the format other than the one it currently holds.  (Originally we attempted
 * to handle such requests by replacing one format with the other, but that
 * had the fatal defect of invalidating any pass-by-reference Datums pointing
 * into the existing slot contents.)  Both copies must contain identical data
 * payloads when this is the case.
 *
 * The Datum/isnull arrays of a TupleTableSlot serve double duty.  When the
 * slot contains a virtual tuple, they are the authoritative data.  When the
 * slot contains a physical tuple, the arrays contain data extracted from
 * the tuple.  (In this state, any pass-by-reference Datums point into
 * the physical tuple.)  The extracted information is built "lazily",
 * ie, only as needed.  This serves to avoid repeated extraction of data
 * from the physical tuple.
 *
 * A TupleTableSlot can also be "empty", holding no valid data.  This is
 * the only valid state for a freshly-created slot that has not yet had a
 * tuple descriptor assigned to it.  In this state, tts_isempty must be
 * true, tts_shouldFree false, tts_tuple NULL, tts_buffer InvalidBuffer,
 * and tts_nvalid zero.
 *
 * The tupleDescriptor is simply referenced, not copied, by the TupleTableSlot
 * code.  The caller of ExecSetSlotDescriptor() is responsible for providing
 * a descriptor that will live as long as the slot does.  (Typically, both
 * slots and descriptors are in per-query memory and are freed by memory
 * context deallocation at query end; so it's not worth providing any extra
 * mechanism to do more.  However, the slot will increment the tupdesc
 * reference count if a reference-counted tupdesc is supplied.)
 *
 * When tts_shouldFree is true, the physical tuple is "owned" by the slot
 * and should be freed when the slot's reference to the tuple is dropped.
 *
 * If tts_buffer is not InvalidBuffer, then the slot is holding a pin
 * on the indicated buffer page; drop the pin when we release the
 * slot's reference to that buffer.  (tts_shouldFree should always be
 * false in such a case, since presumably tts_tuple is pointing at the
 * buffer page.)
 *
 * tts_nvalid indicates the number of valid columns in the tts_values/isnull
 * arrays.  When the slot is holding a "virtual" tuple this must be equal
 * to the descriptor's natts.  When the slot is holding a physical tuple
 * this is equal to the number of columns we have extracted (we always
 * extract columns from left to right, so there are no holes).
 *
 * tts_values/tts_isnull are allocated when a descriptor is assigned to the
 * slot; they are of length equal to the descriptor's natts.
 *
 * tts_mintuple must always be NULL if the slot does not hold a "minimal"
 * tuple.  When it does, tts_mintuple points to the actual MinimalTupleData
 * object (the thing to be pfree'd if tts_shouldFreeMin is true).  If the slot
 * has only a minimal and not also a regular physical tuple, then tts_tuple
 * points at tts_minhdr and the fields of that struct are set correctly
 * for access to the minimal tuple; in particular, tts_minhdr.t_data points
 * MINIMAL_TUPLE_OFFSET bytes before tts_mintuple.  This allows column
 * extraction to treat the case identically to regular physical tuples.
 *
 * tts_slow/tts_off are saved state for slot_deform_tuple, and should not
 * be touched by any other code.
 *----------
 */

/*
 * Older layout.  This is the one the gdb session later in this article
 * prints (tts_isempty, tts_tuple, tts_buffer, ...).
 */
typedef struct TupleTableSlot
{
    NodeTag     type;
    bool        tts_isempty;        /* true = slot is empty */
    bool        tts_shouldFree;     /* should pfree tts_tuple? */
    bool        tts_shouldFreeMin;  /* should pfree tts_mintuple? */
#define FIELDNO_TUPLETABLESLOT_SLOW 4
    bool        tts_slow;           /* saved state for slot_deform_tuple */
#define FIELDNO_TUPLETABLESLOT_TUPLE 5
    HeapTuple   tts_tuple;          /* physical tuple, or NULL if virtual */
#define FIELDNO_TUPLETABLESLOT_TUPLEDESCRIPTOR 6
    TupleDesc   tts_tupleDescriptor;    /* slot's tuple descriptor */
    MemoryContext tts_mcxt;         /* slot itself is in this context */
    Buffer      tts_buffer;         /* tuple's buffer, or InvalidBuffer */
#define FIELDNO_TUPLETABLESLOT_NVALID 9
    int         tts_nvalid;         /* # of valid values in tts_values */
#define FIELDNO_TUPLETABLESLOT_VALUES 10
    Datum      *tts_values;         /* current per-attribute values */
#define FIELDNO_TUPLETABLESLOT_ISNULL 11
    bool       *tts_isnull;         /* current per-attribute isnull flags */
    MinimalTuple tts_mintuple;      /* minimal tuple, or NULL if none */
    HeapTupleData tts_minhdr;       /* workspace for minimal-tuple-only case */
#define FIELDNO_TUPLETABLESLOT_OFF 14
    uint32      tts_off;            /* saved state for slot_deform_tuple */
    bool        tts_fixedTupleDescriptor;   /* descriptor can't be changed */
} TupleTableSlot;

/*
 * Newer layout: a base tuple table slot type.  Per-format state is no longer
 * held in the base struct; behavior is reached through the tts_ops callbacks.
 */
typedef struct TupleTableSlot
{
    NodeTag     type;
#define FIELDNO_TUPLETABLESLOT_FLAGS 1
    uint16      tts_flags;          /* Boolean states */
#define FIELDNO_TUPLETABLESLOT_NVALID 2
    AttrNumber  tts_nvalid;         /* # of valid values in tts_values */
    const TupleTableSlotOps *const tts_ops; /* implementation of slot */
#define FIELDNO_TUPLETABLESLOT_TUPLEDESCRIPTOR 4
    TupleDesc   tts_tupleDescriptor;    /* slot's tuple descriptor */
#define FIELDNO_TUPLETABLESLOT_VALUES 5
    Datum      *tts_values;         /* current per-attribute values */
#define FIELDNO_TUPLETABLESLOT_ISNULL 6
    bool       *tts_isnull;         /* current per-attribute isnull flags */
    MemoryContext tts_mcxt;         /* slot itself is in this context */
} TupleTableSlot;

/* routines for a TupleTableSlot implementation */
struct TupleTableSlotOps
{
    /* Minimum size of the slot */
    size_t      base_slot_size;

    /* Initialization. */
    void        (*init) (TupleTableSlot *slot);

    /* Destruction. */
    void        (*release) (TupleTableSlot *slot);

    /*
     * Clear the contents of the slot. Only the contents are expected to be
     * cleared and not the tuple descriptor. Typically an implementation of
     * this callback should free the memory allocated for the tuple contained
     * in the slot.
     */
    void        (*clear) (TupleTableSlot *slot);

    /*
     * Fill up first natts entries of tts_values and tts_isnull arrays with
     * values from the tuple contained in the slot. The function may be called
     * with natts more than the number of attributes available in the tuple,
     * in which case it should set tts_nvalid to the number of returned
     * columns.
     */
    void        (*getsomeattrs) (TupleTableSlot *slot, int natts);

    /*
     * Returns value of the given system attribute as a datum and sets isnull
     * to false, if it's not NULL. Throws an error if the slot type does not
     * support system attributes.
     */
    Datum       (*getsysattr) (TupleTableSlot *slot, int attnum, bool *isnull);

    /*
     * Make the contents of the slot solely depend on the slot, and not on
     * underlying resources (like another memory context, buffers, etc).
     */
    void        (*materialize) (TupleTableSlot *slot);

    /*
     * Copy the contents of the source slot into the destination slot's own
     * context. Invoked using callback of the destination slot.
     */
    void        (*copyslot) (TupleTableSlot *dstslot, TupleTableSlot *srcslot);

    /*
     * Return a heap tuple "owned" by the slot. It is slot's responsibility to
     * free the memory consumed by the heap tuple. If the slot can not "own" a
     * heap tuple, it should not implement this callback and should set it as
     * NULL.
     */
    HeapTuple   (*get_heap_tuple) (TupleTableSlot *slot);

    /*
     * Return a minimal tuple "owned" by the slot. It is slot's responsibility
     * to free the memory consumed by the minimal tuple. If the slot can not
     * "own" a minimal tuple, it should not implement this callback and should
     * set it as NULL.
     */
    MinimalTuple (*get_minimal_tuple) (TupleTableSlot *slot);

    /*
     * Return a copy of heap tuple representing the contents of the slot. The
     * copy needs to be palloc'd in the current memory context. The slot
     * itself is expected to remain unaffected. It is *not* expected to have
     * meaningful "system columns" in the copy. The copy is not "owned" by
     * the slot i.e. the caller has to take responsibility to free memory
     * consumed by the slot.
     */
    HeapTuple   (*copy_heap_tuple) (TupleTableSlot *slot);

    /*
     * Return a copy of minimal tuple representing the contents of the slot.
     * The copy needs to be palloc'd in the current memory context. The slot
     * itself is expected to remain unaffected. It is *not* expected to have
     * meaningful "system columns" in the copy. The copy is not "owned" by
     * the slot i.e. the caller has to take responsibility to free memory
     * consumed by the slot.
     */
    MinimalTuple (*copy_minimal_tuple) (TupleTableSlot *slot);
};

typedef struct tupleDesc
{
    int         natts;          /* number of attributes in the tuple */
    Oid         tdtypeid;       /* composite type ID for tuple type */
    int32       tdtypmod;       /* typmod for tuple type */
    int         tdrefcount;     /* reference count, or -1 if not counting */
    TupleConstr *constr;        /* constraints, or NULL if none */
    /* attrs[N] is the description of Attribute Number N+1 */
    FormData_pg_attribute attrs[FLEXIBLE_ARRAY_MEMBER];
} *TupleDesc;
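The header comment above says the Datum/isnull arrays are filled "lazily", left to right, with tts_nvalid recording how many columns have been extracted so far. The following is a minimal sketch of that idea only. ToySlot, toy_store_tuple, toy_getsomeattrs and toy_getattr are hypothetical names invented for illustration, and a "tuple" here is just an array of ints rather than a real heap tuple.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a slot; not PostgreSQL's struct. */
#define TOY_NATTS 4

typedef struct ToySlot
{
    const int  *tuple;              /* "physical" tuple: TOY_NATTS ints */
    const bool *tuple_nulls;        /* null flags stored with the tuple */
    int         nvalid;             /* columns extracted so far (cf. tts_nvalid) */
    int         values[TOY_NATTS];  /* extracted values (cf. tts_values) */
    bool        isnull[TOY_NATTS];  /* extracted null flags (cf. tts_isnull) */
    int         extract_calls;      /* counts per-column extraction work */
} ToySlot;

/* Store a new tuple: nothing is extracted yet, so nvalid goes back to 0. */
static void
toy_store_tuple(ToySlot *slot, const int *tuple, const bool *nulls)
{
    slot->tuple = tuple;
    slot->tuple_nulls = nulls;
    slot->nvalid = 0;
    slot->extract_calls = 0;
}

/* Extract columns left to right until the first natts are valid. */
static void
toy_getsomeattrs(ToySlot *slot, int natts)
{
    assert(natts >= 0 && natts <= TOY_NATTS);
    while (slot->nvalid < natts)
    {
        int         i = slot->nvalid;

        slot->values[i] = slot->tuple[i];
        slot->isnull[i] = slot->tuple_nulls[i];
        slot->extract_calls++;
        slot->nvalid++;
    }
}

/* Fetch one column (1-based), extracting only if it is not cached yet. */
static int
toy_getattr(ToySlot *slot, int attnum, bool *isnull)
{
    assert(attnum >= 1 && attnum <= TOY_NATTS);
    if (attnum > slot->nvalid)
        toy_getsomeattrs(slot, attnum);
    *isnull = slot->isnull[attnum - 1];
    return slot->values[attnum - 1];
}
```

Asking for column 2 extracts columns 1 and 2; a later request for column 1 is served from the arrays without touching the tuple again, which is the "no holes, no repeated extraction" property the comment describes.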
SortState
Run-time state information for the Sort node.
/* ----------------
 *   SortState information
 * ----------------
 */
typedef struct SortState
{
    ScanState   ss;             /* its first field is NodeTag */
    bool        randomAccess;   /* need random access to sort output? */
    bool        bounded;        /* is the result set bounded? */
    int64       bound;          /* if bounded, how many tuples are needed */
    bool        sort_Done;      /* sort completed yet? */
    bool        bounded_Done;   /* value of bounded we did the sort with */
    int64       bound_Done;     /* value of bound we did the sort with */
    void       *tuplesortstate; /* private state of tuplesort.c */
    bool        am_worker;      /* are we a worker? */
    SharedSortInfo *shared_info;    /* one entry per worker */
} SortState;

/* ----------------
 *   Shared memory container for per-worker sort information
 * ----------------
 */
typedef struct SharedSortInfo
{
    int         num_workers;    /* number of workers */
    /* per-worker sort statistics */
    TuplesortInstrumentation sinstrument[FLEXIBLE_ARRAY_MEMBER];
} SharedSortInfo;
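The bounded and bound fields describe a sort that only needs its first N results, which tuplesort handles in the TSS_BOUNDED state with a heap. As a rough sketch of the technique, and not of tuplesort.c itself, the code below keeps the N smallest ints in a max-heap and then turns the heap into an ascending array. ToyBoundedSort and the toy_ functions are hypothetical names, and plain ints stand in for tuples.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical top-N state: a max-heap holding the `bound` smallest values. */
typedef struct ToyBoundedSort
{
    int        *heap;           /* caller-provided array, room for `bound` ints */
    size_t      bound;          /* how many results are needed */
    size_t      count;          /* how many values are currently held */
} ToyBoundedSort;

static void
toy_swap(int *a, int *b)
{
    int         tmp = *a;

    *a = *b;
    *b = tmp;
}

/* Restore the max-heap property below index i, looking at n elements. */
static void
toy_sift_down(int *heap, size_t n, size_t i)
{
    for (;;)
    {
        size_t      largest = i;
        size_t      left = 2 * i + 1;
        size_t      right = 2 * i + 2;

        if (left < n && heap[left] > heap[largest])
            largest = left;
        if (right < n && heap[right] > heap[largest])
            largest = right;
        if (largest == i)
            break;
        toy_swap(&heap[i], &heap[largest]);
        i = largest;
    }
}

/* Restore the max-heap property above index i. */
static void
toy_sift_up(int *heap, size_t i)
{
    while (i > 0)
    {
        size_t      parent = (i - 1) / 2;

        if (heap[parent] >= heap[i])
            break;
        toy_swap(&heap[parent], &heap[i]);
        i = parent;
    }
}

/* Offer one value; values that cannot be in the first `bound` are dropped. */
static void
toy_bounded_put(ToyBoundedSort *s, int value)
{
    assert(s->bound > 0);
    if (s->count < s->bound)
    {
        s->heap[s->count] = value;
        toy_sift_up(s->heap, s->count);
        s->count++;
    }
    else if (value < s->heap[0])
    {
        /* Replace the largest kept value, then repair the heap. */
        s->heap[0] = value;
        toy_sift_down(s->heap, s->count, 0);
    }
}

/* Turn the heap into an ascending array (the classic heapsort second phase). */
static void
toy_bounded_finish(ToyBoundedSort *s)
{
    size_t      n = s->count;

    while (n > 1)
    {
        toy_swap(&s->heap[0], &s->heap[n - 1]);
        n--;
        toy_sift_down(s->heap, n, 0);
    }
}
```

The heap root is always the largest value still kept, so a new value only needs one comparison to be rejected. Memory use stays proportional to the bound rather than to the input size.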
TuplesortInstrumentation
Data structures for reporting sort statistics.
/*
 * Data structures for reporting sort statistics.  Note that
 * TuplesortInstrumentation can't contain any pointers because we
 * sometimes put it in shared memory.
 */
typedef enum
{
    SORT_TYPE_STILL_IN_PROGRESS = 0,    /* sort has not finished yet */
    SORT_TYPE_TOP_N_HEAPSORT,           /* top-N heapsort */
    SORT_TYPE_QUICKSORT,                /* in-memory quicksort */
    SORT_TYPE_EXTERNAL_SORT,            /* external sort */
    SORT_TYPE_EXTERNAL_MERGE            /* external sort followed by merge */
} TuplesortMethod;

typedef enum
{
    SORT_SPACE_TYPE_DISK,               /* space is on disk */
    SORT_SPACE_TYPE_MEMORY              /* space is in memory */
} TuplesortSpaceType;

typedef struct TuplesortInstrumentation
{
    TuplesortMethod sortMethod; /* sort algorithm used */
    TuplesortSpaceType spaceType;   /* type of space spaceUsed represents */
    long        spaceUsed;      /* space consumption, in kB */
} TuplesortInstrumentation;
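Because TuplesortInstrumentation holds only plain values, it can be copied out of shared memory and rendered as text. The sketch below shows one way to turn it into a one-line summary. The enums and struct are repeated from the listing above so the block compiles on its own; toy_method_name, toy_space_name and toy_format_sort_info are hypothetical helpers, and the label strings and output format are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef enum
{
    SORT_TYPE_STILL_IN_PROGRESS = 0,
    SORT_TYPE_TOP_N_HEAPSORT,
    SORT_TYPE_QUICKSORT,
    SORT_TYPE_EXTERNAL_SORT,
    SORT_TYPE_EXTERNAL_MERGE
} TuplesortMethod;

typedef enum
{
    SORT_SPACE_TYPE_DISK,
    SORT_SPACE_TYPE_MEMORY
} TuplesortSpaceType;

typedef struct TuplesortInstrumentation
{
    TuplesortMethod sortMethod;     /* sort algorithm used */
    TuplesortSpaceType spaceType;   /* type of space spaceUsed represents */
    long        spaceUsed;          /* space consumption, in kB */
} TuplesortInstrumentation;

/* Hypothetical helper: human-readable label for a sort method. */
static const char *
toy_method_name(TuplesortMethod m)
{
    switch (m)
    {
        case SORT_TYPE_STILL_IN_PROGRESS:
            return "still in progress";
        case SORT_TYPE_TOP_N_HEAPSORT:
            return "top-N heapsort";
        case SORT_TYPE_QUICKSORT:
            return "quicksort";
        case SORT_TYPE_EXTERNAL_SORT:
            return "external sort";
        case SORT_TYPE_EXTERNAL_MERGE:
            return "external merge";
    }
    return "unknown";
}

/* Hypothetical helper: label for the kind of space that was measured. */
static const char *
toy_space_name(TuplesortSpaceType t)
{
    return (t == SORT_SPACE_TYPE_DISK) ? "Disk" : "Memory";
}

/* Format a one-line summary into buf; returns what snprintf returns. */
static int
toy_format_sort_info(const TuplesortInstrumentation *si,
                     char *buf, size_t buflen)
{
    assert(si != NULL && buf != NULL);
    return snprintf(buf, buflen, "Sort Method: %s  %s: %ldkB",
                    toy_method_name(si->sortMethod),
                    toy_space_name(si->spaceType),
                    si->spaceUsed);
}
```

The struct is passed by pointer and never stores a pointer itself, which is the property the source comment requires for placing it in shared memory.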
tuplesort_performsort carries out the actual sort once all tuples have been supplied.
/*
 * All tuples have been provided; finish the sort.
 */
void
tuplesort_performsort(Tuplesortstate *state)
{
    MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);

#ifdef TRACE_SORT
    if (trace_sort)
        elog(LOG, "performsort of worker %d starting: %s",
             state->worker, pg_rusage_show(&state->ru_start));
#endif

    /* what remains to be done depends on the current status */
    switch (state->status)
    {
        case TSS_INITIAL:

            /*
             * We were able to accumulate all the tuples within the allowed
             * amount of memory, or leader to take over worker tapes
             */
            if (SERIAL(state))
            {
                /* Just qsort 'em and we're done */
                tuplesort_sort_memtuples(state);
                state->status = TSS_SORTEDINMEM;
            }
            else if (WORKER(state))
            {
                /*
                 * Parallel workers must still dump out tuples to tape.  No
                 * merge is required to produce single output run, though.
                 */
                inittapes(state, false);
                dumptuples(state, true);
                worker_nomergeruns(state);
                state->status = TSS_SORTEDONTAPE;
            }
            else
            {
                /*
                 * Leader will take over worker tapes and merge worker runs.
                 * Note that mergeruns sets the correct state->status.
                 */
                leader_takeover_tapes(state);
                mergeruns(state);
            }
            state->current = 0;
            state->eof_reached = false;
            state->markpos_block = 0L;
            state->markpos_offset = 0;
            state->markpos_eof = false;
            break;

        case TSS_BOUNDED:

            /*
             * We were able to accumulate all the tuples required for output
             * in memory, using a heap to eliminate excess tuples.  Now we
             * have to transform the heap to a properly-sorted array.
             */
            sort_bounded_heap(state);
            state->current = 0;
            state->eof_reached = false;
            state->markpos_offset = 0;
            state->markpos_eof = false;
            state->status = TSS_SORTEDINMEM;
            break;

        case TSS_BUILDRUNS:

            /*
             * Finish tape-based sort.  First, flush all tuples remaining in
             * memory out to tape; then merge until we have a single remaining
             * run (or, if !randomAccess and !WORKER(), one run per tape).
             * Note that mergeruns sets the correct state->status.
             */
            dumptuples(state, true);    /* flush everything to tape */
            mergeruns(state);           /* merge the runs */
            state->eof_reached = false;
            state->markpos_block = 0L;
            state->markpos_offset = 0;
            state->markpos_eof = false;
            break;

        default:
            elog(ERROR, "invalid tuplesort state");
            break;
    }

#ifdef TRACE_SORT
    if (trace_sort)
    {
        if (state->status == TSS_FINALMERGE)
            elog(LOG, "performsort of worker %d done (except %d-way final merge): %s",
                 state->worker, state->activeTapes,
                 pg_rusage_show(&state->ru_start));
        else
            elog(LOG, "performsort of worker %d done: %s",
                 state->worker, pg_rusage_show(&state->ru_start));
    }
#endif

    MemoryContextSwitchTo(oldcontext);
}
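The function is essentially a dispatch on state->status that ends by resetting the read position. A stripped-down model of the serial TSS_INITIAL path may make that shape easier to see. Everything here is a hypothetical illustration: ToySortState, ToyStatus and toy_performsort are invented names, ints stand in for tuples, and the bounded and tape-based paths are deliberately left unmodeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical status values loosely mirroring the TSS_* states. */
typedef enum ToyStatus
{
    TOY_INITIAL,                /* all input is still in memory, unsorted */
    TOY_BOUNDED,                /* top-N heap in use (not modeled here) */
    TOY_BUILDRUNS,              /* runs are being written out (not modeled) */
    TOY_SORTEDINMEM             /* sort finished, result is in memory */
} ToyStatus;

typedef struct ToySortState
{
    ToyStatus   status;
    int        *memtuples;      /* in-memory "tuples" */
    size_t      memtupcount;    /* number of tuples held */
    size_t      current;        /* read position for fetching results */
    bool        eof_reached;    /* has a reader hit the end? */
} ToySortState;

/* qsort comparator that avoids the overflow risk of "x - y". */
static int
toy_compare_int(const void *a, const void *b)
{
    int         x = *(const int *) a;
    int         y = *(const int *) b;

    return (x > y) - (x < y);
}

/*
 * Finish the sort.  Returns false for states this sketch does not handle,
 * where the real function would either do more work or raise an error.
 */
static bool
toy_performsort(ToySortState *state)
{
    assert(state != NULL);
    switch (state->status)
    {
        case TOY_INITIAL:
            /* Everything fit in memory: sort in place. */
            qsort(state->memtuples, state->memtupcount,
                  sizeof(int), toy_compare_int);
            state->status = TOY_SORTEDINMEM;
            /* Rewind so that results are read from the beginning. */
            state->current = 0;
            state->eof_reached = false;
            return true;
        case TOY_BOUNDED:
        case TOY_BUILDRUNS:
            return false;       /* intentionally not modeled */
        default:
            return false;       /* e.g. called twice on the same state */
    }
}
```

Calling it a second time lands in the default branch, which corresponds to the "invalid tuplesort state" error in the real function.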
Test script
select * from t_sort order by c1,c2;
Trace analysis
(gdb) b tuplesort_begin_heap
Breakpoint 1 at 0xa6ffa1: file tuplesort.c, line 812.
(gdb) b tuplesort_puttupleslot
Breakpoint 2 at 0xa7119d: file tuplesort.c, line 1436.
(gdb) b tuplesort_performsort
Breakpoint 3 at 0xa71f45: file tuplesort.c, line 1792.
(gdb) c
Continuing.

Breakpoint 1, tuplesort_begin_heap (tupDesc=0x208fa40, nkeys=2, attNums=0x2081858, sortOperators=0x2081878, sortCollations=0x2081898, nullsFirstFlags=0x20818b8, workMem=4096, coordinate=0x0, randomAccess=false) at tuplesort.c:812
812   Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
(gdb)
tuplesort_begin_heap
Input parameters
(gdb) p *tupDesc
$1 = {natts = 7, tdtypeid = 2249, tdtypmod = -1, tdhasoid = false, tdrefcount = -1, constr = 0x0, attrs = 0x208fa60}
(gdb) p *tupDesc->attrs
$2 = {attrelid = 0, attname = {data = '\000' <repeats 63 times>}, atttypid = 1043, attstattarget = -1, attlen = -1, attnum = 1, attndims = 0, attcacheoff = -1, atttypmod = 24, attbyval = false, attstorage = 120 'x', attalign = 105 'i', attnotnull = false, atthasdef = false, atthasmissing = false, attidentity = 0 '\000', attisdropped = false, attislocal = true, attinhcount = 0, attcollation = 100}
(gdb) p *attNums
$3 = 2
(gdb) p *sortOperators
$4 = 97
(gdb) p *sortCollations
$5 = 0
(gdb) p nullsFirstFlags
$6 = (_Bool *) 0x20818b8
(gdb) p *nullsFirstFlags
$7 = false
(gdb)
Obtain the sort state; status = TSS_INITIAL
(gdb) p *state $8 = {status = TSS_INITIAL, nKeys = 0, randomAccess = false, bounded = false, boundUsed = false, bound = 0, tuples = true, availMem = 4169704, allowedMem = 4194304, maxTapes = 0, tapeRange = 0, sortcontext = 0x2093290, tuplecontext = 0x20992c0, tapeset = 0x0, comparetup = 0x0, copytup = 0x0, writetup = 0x0, readtup = 0x0, memtuples = 0x209b310, memtupcount = 0, memtupsize = 1024, growmemtuples = true, slabAllocatorUsed = false, slabMemoryBegin = 0x0, slabMemoryEnd = 0x0, slabFreeHead = 0x0, read_buffer_size = 0, lastReturnedTuple = 0x0, currentRun = 0, mergeactive = 0x0, Level = 0, destTape = 0, tp_fib = 0x0, tp_runs = 0x0, tp_dummy = 0x0, tp_tapenum = 0x0, activeTapes = 0, result_tape = -1, current = 0, eof_reached = false, markpos_block = 0, markpos_offset = 0, markpos_eof = false, worker = -1, shared = 0x0, nParticipants = -1, tupDesc = 0x0, sortKeys = 0x0, onlyKey = 0x0, abbrevNext = 0, indexInfo = 0x0, estate = 0x0, heapRel = 0x0, indexRel = 0x0, enforceUnique = false, high_mask = 0, low_mask = 0, max_buckets = 0, datumType = 0, datumTypeLen = 0, ru_start = {tv = {tv_sec = 0, tv_usec = 0}, ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = { tv_sec = 0, tv_usec = 0}, {ru_maxrss = 0, __ru_maxrss_word = 0}, {ru_ixrss = 0, __ru_ixrss_word = 0}, { ru_idrss = 0, __ru_idrss_word = 0}, {ru_isrss = 0, __ru_isrss_word = 0}, {ru_minflt = 0, __ru_minflt_word = 0}, { ru_majflt = 0, __ru_majflt_word = 0}, {ru_nswap = 0, __ru_nswap_word = 0}, {ru_inblock = 0, __ru_inblock_word = 0}, {ru_oublock = 0, __ru_oublock_word = 0}, {ru_msgsnd = 0, __ru_msgsnd_word = 0}, {ru_msgrcv = 0, __ru_msgrcv_word = 0}, {ru_nsignals = 0, __ru_nsignals_word = 0}, {ru_nvcsw = 0, __ru_nvcsw_word = 0}, { ru_nivcsw = 0, __ru_nivcsw_word = 0}}}}
Set up the run-time state
(gdb) n 819 AssertArg(nkeys > 0); (gdb) 822 if (trace_sort) (gdb) 828 state->nKeys = nkeys; (gdb) 830 TRACE_POSTGRESQL_SORT_START(HEAP_SORT, (gdb) 837 state->comparetup = comparetup_heap; (gdb) 838 state->copytup = copytup_heap; (gdb) 839 state->writetup = writetup_heap; (gdb) 840 state->readtup = readtup_heap; (gdb) 842 state->tupDesc = tupDesc; /* assume we need not copy tupDesc */ (gdb) 843 state->abbrevNext = 10; (gdb) 846 state->sortKeys = (SortSupport) palloc0(nkeys * sizeof(SortSupportData)); (gdb) 848 for (i = 0; i < nkeys; i++) (gdb) p *state $9 = {status = TSS_INITIAL, nKeys = 2, randomAccess = false, bounded = false, boundUsed = false, bound = 0, tuples = true, availMem = 4169704, allowedMem = 4194304, maxTapes = 0, tapeRange = 0, sortcontext = 0x2093290, tuplecontext = 0x20992c0, tapeset = 0x0, comparetup = 0xa7525b <comparetup_heap>, copytup = 0xa76247 <copytup_heap>, writetup = 0xa76de1 <writetup_heap>, readtup = 0xa76ec6 <readtup_heap>, memtuples = 0x209b310, memtupcount = 0, memtupsize = 1024, growmemtuples = true, slabAllocatorUsed = false, slabMemoryBegin = 0x0, slabMemoryEnd = 0x0, slabFreeHead = 0x0, read_buffer_size = 0, lastReturnedTuple = 0x0, currentRun = 0, mergeactive = 0x0, Level = 0, destTape = 0, tp_fib = 0x0, tp_runs = 0x0, tp_dummy = 0x0, tp_tapenum = 0x0, activeTapes = 0, result_tape = -1, current = 0, eof_reached = false, markpos_block = 0, markpos_offset = 0, markpos_eof = false, worker = -1, shared = 0x0, nParticipants = -1, tupDesc = 0x208fa40, sortKeys = 0x20937c0, onlyKey = 0x0, abbrevNext = 10, indexInfo = 0x0, estate = 0x0, heapRel = 0x0, indexRel = 0x0, enforceUnique = false, high_mask = 0, low_mask = 0, max_buckets = 0, datumType = 0, datumTypeLen = 0, ru_start = {tv = {tv_sec = 0, tv_usec = 0}, ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, {ru_maxrss = 0, __ru_maxrss_word = 0}, {ru_ixrss = 0, __ru_ixrss_word = 0}, { ru_idrss = 0, __ru_idrss_word = 0}, {ru_isrss = 0, __ru_isrss_word = 0}, 
{ru_minflt = 0, __ru_minflt_word = 0}, { ru_majflt = 0, __ru_majflt_word = 0}, {ru_nswap = 0, __ru_nswap_word = 0}, {ru_inblock = 0, __ru_inblock_word = 0}, {ru_oublock = 0, __ru_oublock_word = 0}, {ru_msgsnd = 0, __ru_msgsnd_word = 0}, {ru_msgrcv = 0, __ru_msgrcv_word = 0}, {ru_nsignals = 0, __ru_nsignals_word = 0}, {ru_nvcsw = 0, __ru_nvcsw_word = 0}, { ru_nivcsw = 0, __ru_nivcsw_word = 0}}}} (gdb)
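Fill in the SortSupport data for each sort column (c1 and c2); the sortKeys array itself was allocated with palloc0 just before the loop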
(gdb) n
850       SortSupport sortKey = state->sortKeys + i;
(gdb) 
852       AssertArg(attNums[i] != 0);
(gdb) p *state->sortKeys
$10 = {ssup_cxt = 0x0, ssup_collation = 0, ssup_reverse = false, ssup_nulls_first = false, ssup_attno = 0, ssup_extra = 0x0, comparator = 0x0, abbreviate = false, abbrev_converter = 0x0, abbrev_abort = 0x0, abbrev_full_comparator = 0x0}
(gdb) n
853       AssertArg(sortOperators[i] != 0);
(gdb) 
855       sortKey->ssup_cxt = CurrentMemoryContext;
(gdb) 
856       sortKey->ssup_collation = sortCollations[i];
(gdb) 
857       sortKey->ssup_nulls_first = nullsFirstFlags[i];
(gdb) 
858       sortKey->ssup_attno = attNums[i];
(gdb) 
860       sortKey->abbreviate = (i == 0);
(gdb) 
862       PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
(gdb) 
848     for (i = 0; i < nkeys; i++)
(gdb) 
850       SortSupport sortKey = state->sortKeys + i;
(gdb) 
852       AssertArg(attNums[i] != 0);
(gdb) 
853       AssertArg(sortOperators[i] != 0);
(gdb) 
855       sortKey->ssup_cxt = CurrentMemoryContext;
(gdb) 
856       sortKey->ssup_collation = sortCollations[i];
(gdb) 
857       sortKey->ssup_nulls_first = nullsFirstFlags[i];
(gdb) 
858       sortKey->ssup_attno = attNums[i];
(gdb) 
860       sortKey->abbreviate = (i == 0);
(gdb) 
862       PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
(gdb) 
848     for (i = 0; i < nkeys; i++)
(gdb)
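The loop traced above records, for each sort column, which attribute to compare, whether NULLs sort first, and (in the ssup_reverse field visible in the dump) whether the order is reversed. A hedged sketch of how such per-key settings can drive a multi-column comparison follows. ToySortKey, ToyRow and the toy_ functions are hypothetical, the columns are plain ints, and the real comparator is reached through the SortSupport function pointer rather than written inline like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NCOLS 2

/* Hypothetical per-key settings, loosely modeled on the ssup_* fields. */
typedef struct ToySortKey
{
    int         attno;          /* 1-based column number (cf. ssup_attno) */
    bool        reverse;        /* descending order? (cf. ssup_reverse) */
    bool        nulls_first;    /* NULLs before non-NULLs? */
} ToySortKey;

typedef struct ToyRow
{
    int         values[TOY_NCOLS];
    bool        isnull[TOY_NCOLS];
} ToyRow;

/* Compare one column: settle NULL ordering first, then the values. */
static int
toy_compare_datum(int a, bool a_null, int b, bool b_null,
                  const ToySortKey *key)
{
    int         cmp;

    if (a_null)
    {
        if (b_null)
            return 0;           /* NULL "equals" NULL for sorting purposes */
        return key->nulls_first ? -1 : 1;
    }
    if (b_null)
        return key->nulls_first ? 1 : -1;

    cmp = (a > b) - (a < b);
    return key->reverse ? -cmp : cmp;
}

/* Compare two rows key by key; the first non-zero result decides. */
static int
toy_compare_rows(const ToyRow *a, const ToyRow *b,
                 const ToySortKey *keys, int nkeys)
{
    int         i;

    for (i = 0; i < nkeys; i++)
    {
        int         col = keys[i].attno - 1;
        int         cmp;

        assert(col >= 0 && col < TOY_NCOLS);
        cmp = toy_compare_datum(a->values[col], a->isnull[col],
                                b->values[col], b->isnull[col],
                                &keys[i]);
        if (cmp != 0)
            return cmp;
    }
    return 0;
}
```

With two keys, as in ORDER BY c1, c2, the second column is only consulted when the first compares equal.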
Initialization is complete; return state
(gdb) 871 if (nkeys == 1 && !state->sortKeys->abbrev_converter) (gdb) n 874 MemoryContextSwitchTo(oldcontext); (gdb) 876 return state; (gdb) p *state $11 = {status = TSS_INITIAL, nKeys = 2, randomAccess = false, bounded = false, boundUsed = false, bound = 0, tuples = true, availMem = 4169704, allowedMem = 4194304, maxTapes = 0, tapeRange = 0, sortcontext = 0x2093290, tuplecontext = 0x20992c0, tapeset = 0x0, comparetup = 0xa7525b <comparetup_heap>, copytup = 0xa76247 <copytup_heap>, writetup = 0xa76de1 <writetup_heap>, readtup = 0xa76ec6 <readtup_heap>, memtuples = 0x209b310, memtupcount = 0, memtupsize = 1024, growmemtuples = true, slabAllocatorUsed = false, slabMemoryBegin = 0x0, slabMemoryEnd = 0x0, slabFreeHead = 0x0, read_buffer_size = 0, lastReturnedTuple = 0x0, currentRun = 0, mergeactive = 0x0, Level = 0, destTape = 0, tp_fib = 0x0, tp_runs = 0x0, tp_dummy = 0x0, tp_tapenum = 0x0, activeTapes = 0, result_tape = -1, current = 0, eof_reached = false, markpos_block = 0, markpos_offset = 0, markpos_eof = false, worker = -1, shared = 0x0, nParticipants = -1, tupDesc = 0x208fa40, sortKeys = 0x20937c0, onlyKey = 0x0, abbrevNext = 10, indexInfo = 0x0, estate = 0x0, heapRel = 0x0, indexRel = 0x0, enforceUnique = false, high_mask = 0, low_mask = 0, max_buckets = 0, datumType = 0, datumTypeLen = 0, ru_start = {tv = {tv_sec = 0, tv_usec = 0}, ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, {ru_maxrss = 0, __ru_maxrss_word = 0}, {ru_ixrss = 0, __ru_ixrss_word = 0}, { ru_idrss = 0, __ru_idrss_word = 0}, {ru_isrss = 0, __ru_isrss_word = 0}, {ru_minflt = 0, __ru_minflt_word = 0}, { ru_majflt = 0, __ru_majflt_word = 0}, {ru_nswap = 0, __ru_nswap_word = 0}, {ru_inblock = 0, __ru_inblock_word = 0}, {ru_oublock = 0, __ru_oublock_word = 0}, {ru_msgsnd = 0, __ru_msgsnd_word = 0}, {ru_msgrcv = 0, __ru_msgrcv_word = 0}, {ru_nsignals = 0, __ru_nsignals_word = 0}, {ru_nvcsw = 0, __ru_nvcsw_word = 0}, { ru_nivcsw = 0, __ru_nivcsw_word = 0}}}} (gdb)
tuplesort_puttupleslot
This function is called inside a loop:
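for (;;)
{
    /* fetch the next tuple from the outer plan */
    slot = ExecProcNode(outerNode);

    /* stop once the outer plan is exhausted */
    if (TupIsNull(slot))
        break;

    /* hand the tuple to tuplesort; the sort itself happens later */
    tuplesort_puttupleslot(tuplesortstate, slot);
}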
Take one of the slots as an example
(gdb) c
Continuing.

Breakpoint 2, tuplesort_puttupleslot (state=0x20933a8, slot=0x208f8c8) at tuplesort.c:1436
1436    MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
Input parameters: state is the state returned earlier by tuplesort_begin_heap, and slot is the tuple slot returned by the outer node
(gdb) p *slot
$12 = {type = T_TupleTableSlot, tts_isempty = false, tts_shouldFree = false, tts_shouldFreeMin = false, tts_slow = false, tts_tuple = 0x2090678, tts_tupleDescriptor = 0x7f061a300380, tts_mcxt = 0x208f270, tts_buffer = 103, tts_nvalid = 0, tts_values = 0x208f928, tts_isnull = 0x208f960, tts_mintuple = 0x0, tts_minhdr = {t_len = 0, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 0}, t_tableOid = 0, t_data = 0x0}, tts_off = 0, tts_fixedTupleDescriptor = true}
(gdb)
Tuple data in the slot
(gdb) p *slot->tts_values $13 = 0 (gdb) p *slot->tts_tuple $14 = {t_len = 56, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 1}, t_tableOid = 286759, t_data = 0x7f05ee0c4648} (gdb) p *slot->tts_tuple->t_data $15 = {t_choice = {t_heap = {t_xmin = 839, t_xmax = 0, t_field3 = {t_cid = 0, t_xvac = 0}}, t_datum = {datum_len_ = 839, datum_typmod = 0, datum_typeid = 0}}, t_ctid = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 1}, t_infomask2 = 7, t_infomask = 2306, t_hoff = 24 '\030', t_bits = 0x7f05ee0c465f ""} (gdb) p *slot->tts_tuple->t_data->t_bits $16 = 0 '\000' (gdb) x/16ux *slot->tts_tuple->t_data->t_bits 0x0: Cannot access memory at address 0x0 (gdb) x/16ux slot->tts_tuple->t_data->t_bits 0x7f05ee0c465f: 0x5a470b00 0x00003130 0x00000100 0x00000100 0x7f05ee0c466f: 0x00000100 0x00000100 0x00000100 0x00000100 0x7f05ee0c467f: 0x00000000 0x8f282800 0x000000da 0x40023800 0x7f05ee0c468f: 0x04200002 0x00000020 0x709fc800 0x709f9000 (gdb) x/16bx slot->tts_tuple->t_data->t_bits 0x7f05ee0c465f: 0x00 0x0b 0x47 0x5a 0x30 0x31 0x00 0x00 0x7f05ee0c4667: 0x00 0x01 0x00 0x00 0x00 0x01 0x00 0x00 (gdb) x/16bc slot->tts_tuple->t_data->t_bits 0x7f05ee0c465f: 0 '\000' 11 '\v' 71 'G' 90 'Z' 48 '0' 49 '1' 0 '\000' 0 '\000' 0x7f05ee0c4667: 0 '\000' 1 '\001' 0 '\000' 0 '\000' 0 '\000' 1 '\001' 0 '\000' 0 '\000' (gdb) p *slot->tts_tupleDescriptor $17 = {natts = 7, tdtypeid = 286761, tdtypmod = -1, tdhasoid = false, tdrefcount = 2, constr = 0x0, attrs = 0x7f061a3003a0} (gdb) p *slot $18 = {type = T_TupleTableSlot, tts_isempty = false, tts_shouldFree = false, tts_shouldFreeMin = false, tts_slow = false, tts_tuple = 0x2090678, tts_tupleDescriptor = 0x7f061a300380, tts_mcxt = 0x208f270, tts_buffer = 103, tts_nvalid = 0, tts_values = 0x208f928, tts_isnull = 0x208f960, tts_mintuple = 0x0, tts_minhdr = {t_len = 0, t_self = {ip_blkid = { bi_hi = 0, bi_lo = 0}, ip_posid = 0}, t_tableOid = 0, t_data = 0x0}, tts_off = 0, tts_fixedTupleDescriptor = true} (gdb) p *slot->tts_values[0] Cannot 
access memory at address 0x0 (gdb) p slot->tts_values[0] $19 = 0 (gdb) x/32bc slot->tts_tuple->t_data->t_bits 0x7f05ee0c465f: 0 '\000' 11 '\v' 71 'G' 90 'Z' 48 '0' 49 '1' 0 '\000' 0 '\000' 0x7f05ee0c4667: 0 '\000' 1 '\001' 0 '\000' 0 '\000' 0 '\000' 1 '\001' 0 '\000' 0 '\000' 0x7f05ee0c466f: 0 '\000' 1 '\001' 0 '\000' 0 '\000' 0 '\000' 1 '\001' 0 '\000' 0 '\000' 0x7f05ee0c4677: 0 '\000' 1 '\001' 0 '\000' 0 '\000' 0 '\000' 1 '\001' 0 '\000' 0 '\000' (gdb) x/32bx slot->tts_tuple->t_data->t_bits 0x7f05ee0c465f: 0x00 0x0b 0x47 0x5a 0x30 0x31 0x00 0x00 0x7f05ee0c4667: 0x00 0x01 0x00 0x00 0x00 0x01 0x00 0x00 0x7f05ee0c466f: 0x00 0x01 0x00 0x00 0x00 0x01 0x00 0x00 0x7f05ee0c4677: 0x00 0x01 0x00 0x00 0x00 0x01 0x00 0x00
Copy the tuple and store it in state->memtuples
(gdb) n
1443    COPYTUP(state, &stup, (void *) slot);
(gdb) 
1445    puttuple_common(state, &stup);
(gdb) step
puttuple_common (state=0x20933a8, tuple=0x7ffe890e0b00) at tuplesort.c:1639
1639    Assert(!LEADER(state));
(gdb) n
1641    switch (state->status)
(gdb) p state->status
$20 = TSS_INITIAL
(gdb) n
1652        if (state->memtupcount >= state->memtupsize - 1)
(gdb) p state->memtupcount
$21 = 0
(gdb) p state->memtupsize - 1
$22 = 1023
(gdb) n
1657        state->memtuples[state->memtupcount++] = *tuple;
(gdb) 
1671        if (state->bounded &&
(gdb) p state->bounded
$23 = false
(gdb) n
1688        if (state->memtupcount < state->memtupsize && !LACKMEM(state))
(gdb) 
1689          return;
(gdb) 
1743  }
(gdb) 
tuplesort_puttupleslot (state=0x20933a8, slot=0x208f8c8) at tuplesort.c:1447
1447    MemoryContextSwitchTo(oldcontext);
(gdb) 
1448  }
(gdb) 
(gdb) p state->memtuples[0]
$25 = {tuple = 0x20993d8, datum1 = 1, isnull1 = false, tupindex = 0}
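The trace shows the tuple being appended to memtuples while the status is still TSS_INITIAL, after a check of memtupcount against memtupsize. The sketch below models the general pattern of growing an array until a budget is hit and then switching to run building. It is a heavy simplification under stated assumptions: ToyPutState and the toy_ functions are hypothetical, the budget is a tuple count (max_tuples) rather than a byte count, and "dumping a run" merely resets the counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef enum ToyPutStatus
{
    TOY_PUT_INITIAL,            /* still collecting everything in memory */
    TOY_PUT_BUILDRUNS           /* budget exceeded; writing runs */
} ToyPutStatus;

typedef struct ToyPutState
{
    ToyPutStatus status;
    int        *memtuples;      /* in-memory "tuples" */
    size_t      memtupcount;    /* tuples currently held */
    size_t      memtupsize;     /* allocated capacity */
    size_t      max_tuples;     /* assumed stand-in for the memory budget */
    size_t      runs_dumped;    /* how many runs were "written" */
} ToyPutState;

/* Allocate the initial array; returns false if allocation fails. */
static bool
toy_put_init(ToyPutState *s, size_t initial_size, size_t max_tuples)
{
    assert(initial_size > 0 && initial_size <= max_tuples);
    s->memtuples = malloc(initial_size * sizeof(int));
    if (s->memtuples == NULL)
        return false;
    s->status = TOY_PUT_INITIAL;
    s->memtupcount = 0;
    s->memtupsize = initial_size;
    s->max_tuples = max_tuples;
    s->runs_dumped = 0;
    return true;
}

/* Try to double the array, capped at the budget; false if it cannot grow. */
static bool
toy_grow_memtuples(ToyPutState *s)
{
    size_t      newsize;
    int        *newarr;

    if (s->memtupsize >= s->max_tuples)
        return false;
    newsize = s->memtupsize * 2;
    if (newsize > s->max_tuples)
        newsize = s->max_tuples;
    newarr = realloc(s->memtuples, newsize * sizeof(int));
    if (newarr == NULL)
        return false;
    s->memtuples = newarr;
    s->memtupsize = newsize;
    return true;
}

/* Add one tuple, spilling the current batch as a run when out of room. */
static void
toy_puttuple(ToyPutState *s, int value)
{
    if (s->memtupcount >= s->memtupsize && !toy_grow_memtuples(s))
    {
        /* Budget exhausted: pretend to write the batch out as one run. */
        s->status = TOY_PUT_BUILDRUNS;
        s->runs_dumped++;
        s->memtupcount = 0;
    }
    s->memtuples[s->memtupcount++] = value;
}

static void
toy_put_free(ToyPutState *s)
{
    free(s->memtuples);
    s->memtuples = NULL;
}
```

Once the status has flipped, a later call that finishes the sort has to take the run-merging path, which matches what the trace shows when tuplesort_performsort is finally reached.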
tuplesort_performsort
(gdb) info b
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x0000000000a6ffa1 in tuplesort_begin_heap at tuplesort.c:812
        breakpoint already hit 1 time
2       breakpoint     keep y   0x0000000000a7119d in tuplesort_puttupleslot at tuplesort.c:1436
        breakpoint already hit 1 time
3       breakpoint     keep y   0x0000000000a71f45 in tuplesort_performsort at tuplesort.c:1792
(gdb) del 2
(gdb) c
Continuing.

Breakpoint 3, tuplesort_performsort (state=0x20933a8) at tuplesort.c:1792
1792    MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
(gdb)
Input parameters
(gdb) p *state $27 = {status = TSS_BUILDRUNS, nKeys = 2, randomAccess = false, bounded = false, boundUsed = false, bound = 0, tuples = true, availMem = 824360, allowedMem = 4194304, maxTapes = 16, tapeRange = 15, sortcontext = 0x2093290, tuplecontext = 0x20992c0, tapeset = 0x2093a00, comparetup = 0xa7525b <comparetup_heap>, copytup = 0xa76247 <copytup_heap>, writetup = 0xa76de1 <writetup_heap>, readtup = 0xa76ec6 <readtup_heap>, memtuples = 0x2611570, memtupcount = 26592, memtupsize = 37448, growmemtuples = false, slabAllocatorUsed = false, slabMemoryBegin = 0x0, slabMemoryEnd = 0x0, slabFreeHead = 0x0, read_buffer_size = 0, lastReturnedTuple = 0x0, currentRun = 2, mergeactive = 0x2093878, Level = 1, destTape = 2, tp_fib = 0x20938a0, tp_runs = 0x20938f8, tp_dummy = 0x2093950, tp_tapenum = 0x20939a8, activeTapes = 0, result_tape = -1, current = 0, eof_reached = false, markpos_block = 0, markpos_offset = 0, markpos_eof = false, worker = -1, shared = 0x0, nParticipants = -1, tupDesc = 0x208fa40, sortKeys = 0x20937c0, onlyKey = 0x0, abbrevNext = 10, indexInfo = 0x0, estate = 0x0, heapRel = 0x0, indexRel = 0x0, enforceUnique = false, high_mask = 0, low_mask = 0, max_buckets = 0, datumType = 0, datumTypeLen = 0, ru_start = {tv = {tv_sec = 0, tv_usec = 0}, ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, {ru_maxrss = 0, __ru_maxrss_word = 0}, {ru_ixrss = 0, __ru_ixrss_word = 0}, {ru_idrss = 0, __ru_idrss_word = 0}, {ru_isrss = 0, __ru_isrss_word = 0}, {ru_minflt = 0, __ru_minflt_word = 0}, {ru_majflt = 0, __ru_majflt_word = 0}, {ru_nswap = 0, __ru_nswap_word = 0}, {ru_inblock = 0, __ru_inblock_word = 0}, { ru_oublock = 0, __ru_oublock_word = 0}, {ru_msgsnd = 0, __ru_msgsnd_word = 0}, {ru_msgrcv = 0, __ru_msgrcv_word = 0}, {ru_nsignals = 0, __ru_nsignals_word = 0}, {ru_nvcsw = 0, __ru_nvcsw_word = 0}, { ru_nivcsw = 0, __ru_nivcsw_word = 0}}}} (gdb) p state->memtupsize $28 = 37448 (gdb)
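state->status has already switched to TSS_BUILDRUNS: the input did not fit within the memory budget (allowedMem = 4194304), so a tape set has been created (maxTapes = 16) and runs are already being written (currentRun = 2)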
(gdb) n
1795    if (trace_sort)
(gdb) 
1800    switch (state->status)
(gdb) p state->status
$29 = TSS_BUILDRUNS
(gdb)
Flush all remaining tuples to tape, then merge the runs
(gdb) n
1864        dumptuples(state, true);
(gdb) 
1865        mergeruns(state);
(gdb) 
1866        state->eof_reached = false;
(gdb) 
1867        state->markpos_block = 0L;
(gdb) 
1868        state->markpos_offset = 0;
(gdb) 
1869        state->markpos_eof = false;
(gdb) 
1870        break;
(gdb) 
1878    if (trace_sort)
(gdb) 
1890    MemoryContextSwitchTo(oldcontext);
(gdb) 
1891  }
(gdb)
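The two calls traced here, dumptuples followed by mergeruns, amount to an external merge sort: sorted runs are written out, then merged into one ordered stream. The sketch below illustrates that idea in memory only. ToyRun, toy_build_runs and toy_merge_runs are hypothetical names; the runs are slices of one array instead of tapes, and the merge picks the smallest head with a linear scan, which is simpler than the merge logic in tuplesort.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sorted run: a slice of ints plus a read position. */
typedef struct ToyRun
{
    const int  *data;
    size_t      len;
    size_t      pos;
} ToyRun;

static int
toy_cmp_int(const void *a, const void *b)
{
    int         x = *(const int *) a;
    int         y = *(const int *) b;

    return (x > y) - (x < y);
}

/*
 * Cut the input into chunks of at most run_len values, sort each chunk in
 * place, and describe each one as a run.  Returns the number of runs.
 */
static size_t
toy_build_runs(int *input, size_t n, size_t run_len,
               ToyRun *runs, size_t max_runs)
{
    size_t      nruns = 0;
    size_t      start;

    assert(run_len > 0);
    for (start = 0; start < n; start += run_len)
    {
        size_t      len = (n - start < run_len) ? n - start : run_len;

        assert(nruns < max_runs);
        qsort(input + start, len, sizeof(int), toy_cmp_int);
        runs[nruns].data = input + start;
        runs[nruns].len = len;
        runs[nruns].pos = 0;
        nruns++;
    }
    return nruns;
}

/* Merge all runs into out; returns the number of values written. */
static size_t
toy_merge_runs(ToyRun *runs, size_t nruns, int *out, size_t outcap)
{
    size_t      n = 0;

    for (;;)
    {
        size_t      best = nruns;   /* nruns means "no run chosen yet" */
        size_t      i;

        /* Pick the run whose next value is smallest. */
        for (i = 0; i < nruns; i++)
        {
            if (runs[i].pos >= runs[i].len)
                continue;           /* this run is exhausted */
            if (best == nruns ||
                runs[i].data[runs[i].pos] < runs[best].data[runs[best].pos])
                best = i;
        }
        if (best == nruns)
            break;                  /* every run is exhausted */
        assert(n < outcap);
        out[n++] = runs[best].data[runs[best].pos++];
    }
    return n;
}
```

Only one value per run has to be examined at any moment, which is why the merge phase can work with far less memory than the total amount of data being sorted.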