Class MapReduce
Implements a simplistic version of the popular Map-Reduce algorithm. Acts like an iterator for the original passed data after each result has been processed, thus offering a transparent wrapper for results coming from any source.
Properties summary
- $_counter protected
int
Count of elements emitted during the Reduce phase
- $_data protected
\Traversable
Holds the original data that needs to be processed
- $_executed protected
bool
Whether the Map-Reduce routine has been executed already on the data
- $_intermediate protected
array
Holds the shuffled results that were emitted from the map phase
- $_mapper protected
callable
A callable that will be executed for each record in the original data
- $_reducer protected
callable|null
A callable that will be executed for each intermediate record emitted during the Map phase
- $_result protected
array
Holds the results as emitted during the reduce phase
Method Summary
- _execute() protected
Runs the actual Map-Reduce algorithm. This iterates over the original data and calls the mapper function for each record, then calls the reduce function for each intermediate bucket created during the Map phase.
- emit() public
Appends a new record to the final list of results and optionally assigns a key to this record.
- emitIntermediate() public
Appends a new record to the bucket labelled with $bucket, usually as a result of mapping a single record from the original data.
- getIterator() public
Returns an iterator with the end result of running the Map and Reduce phases on the original data
Method Detail
__construct() public
__construct(\Traversable $data, callable $mapper, ?callable $reducer)
Constructor
Example:
Separate all unique odd and even numbers in an array
$data = new \ArrayObject([1, 2, 3, 4, 5, 3]);
$mapper = function ($value, $key, $mr) {
    $type = ($value % 2 === 0) ? 'even' : 'odd';
    $mr->emitIntermediate($value, $type);
};
$reducer = function ($numbers, $type, $mr) {
    $mr->emit(array_unique($numbers), $type);
};
$results = new MapReduce($data, $mapper, $reducer);
The previous example generates the following result:
['odd' => [1, 3, 5], 'even' => [2, 4]]
Parameters
-
\Traversable
$data the original data to be processed
-
callable
$mapper The mapper callback. This function receives three arguments: the current value, the key of the current value, and this class instance, so that you can call the result emitters.
-
callable|null
$reducer (optional) The reducer callback. This function receives three arguments: the list of values inside a bucket, the name of the bucket that was created during the mapping phase, and this class instance.
_execute() protected
_execute()
Runs the actual Map-Reduce algorithm. This iterates over the original data and calls the mapper function for each record, then calls the reduce function for each intermediate bucket created during the Map phase.
Throws
LogicException
if emitIntermediate was called but no reducer function was provided
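The flow described above can be sketched with a small stand-in class. This is an illustration only, not the CakePHP implementation: the class name SketchMapReduce, its private property names, the exception message, and the choice to index unkeyed results by a running counter are all assumptions made for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for illustration; not the CakePHP source.
final class SketchMapReduce implements IteratorAggregate
{
    private iterable $data;
    /** @var callable */
    private $mapper;
    /** @var callable|null */
    private $reducer;
    private array $intermediate = [];
    private array $result = [];
    private int $counter = 0;
    private bool $executed = false;

    public function __construct(iterable $data, callable $mapper, ?callable $reducer = null)
    {
        $this->data = $data;
        $this->mapper = $mapper;
        $this->reducer = $reducer;
    }

    public function emitIntermediate($val, $bucket): void
    {
        // Group the record under its bucket name.
        $this->intermediate[$bucket][] = $val;
    }

    public function emit($val, $key = null): void
    {
        // Assumed: unkeyed records are indexed by the running counter.
        $this->result[$key ?? $this->counter] = $val;
        $this->counter++;
    }

    public function getIterator(): Traversable
    {
        // Run the map and reduce phases only once, on first iteration.
        if (!$this->executed) {
            $this->execute();
        }

        return new ArrayIterator($this->result);
    }

    private function execute(): void
    {
        // Map phase: call the mapper for each record of the original data.
        foreach ($this->data as $key => $val) {
            ($this->mapper)($val, $key, $this);
        }
        // Buckets exist but nothing can reduce them.
        if ($this->intermediate !== [] && $this->reducer === null) {
            throw new LogicException('No reducer function was provided');
        }
        // Reduce phase: call the reducer once per intermediate bucket.
        foreach ($this->intermediate as $bucket => $values) {
            ($this->reducer)($values, $bucket, $this);
        }
        $this->intermediate = [];
        $this->executed = true;
    }
}

$mapper = function ($value, $key, $mr): void {
    $mr->emitIntermediate($value, $value % 2 === 0 ? 'even' : 'odd');
};
$reducer = function ($numbers, $type, $mr): void {
    $mr->emit(array_unique($numbers), $type);
};

// Same odd/even split as the constructor example.
$results = iterator_to_array(new SketchMapReduce([1, 2, 3, 4, 5, 3], $mapper, $reducer));

// Same mapper without a reducer: the sketch throws during iteration.
try {
    iterator_to_array(new SketchMapReduce([1, 2], $mapper));
    $threw = false;
} catch (LogicException $e) {
    $threw = true;
}
```

In this sketch the exception is raised only when iteration starts, because nothing runs until getIterator() is called.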
emit() public
emit(mixed $val, mixed $key)
Appends a new record to the final list of results and optionally assigns a key to this record.
Parameters
-
mixed
$val The value to be appended to the final list of results
-
mixed
$key (optional) An optional key to assign to the value
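As a rough illustration of how keyed and unkeyed records might end up in the result list, the sketch below uses a closure as a hypothetical stand-in for emit(). It assumes that a record emitted without a key is stored under the running count of emitted records; the description of $_counter suggests this, but treat it as an assumption rather than documented behavior.

```php
<?php
declare(strict_types=1);

$result = [];
$counter = 0;

// Hypothetical stand-in for emit(): a keyed record is stored under its key,
// an unkeyed one under the running count of emitted records (assumed).
$emit = function ($val, $key = null) use (&$result, &$counter): void {
    $result[$key ?? $counter] = $val;
    $counter++;
};

$emit('first');           // no key
$emit('second', 'named'); // explicit key
$emit('third');           // no key
```

Under this assumption, emitting the same key twice would overwrite the earlier record, since the results are held in a plain PHP array.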
emitIntermediate() public
emitIntermediate(mixed $val, mixed $bucket)
Appends a new record to the bucket labelled with $bucket, usually as a result of mapping a single record from the original data.
Parameters
-
mixed
$val The record itself to store in the bucket
-
mixed
$bucket the name of the bucket where to put the record
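The bucketing step can be pictured as appending to a nested array keyed by bucket name. The closure below is a hypothetical stand-in for emitIntermediate(), and the word list is made up for the example; it is a sketch of the idea, not the library code.

```php
<?php
declare(strict_types=1);

$buckets = [];

// Hypothetical stand-in for emitIntermediate(): append the record
// to the list stored under the given bucket name.
$emitIntermediate = function ($val, $bucket) use (&$buckets): void {
    $buckets[$bucket][] = $val;
};

// Group words by their first letter, as a mapper might do.
foreach (['apple', 'avocado', 'banana', 'cherry', 'blueberry'] as $word) {
    $emitIntermediate($word, $word[0]);
}
```

Each bucket is then what a reducer would receive as its first argument, with the bucket name as the second. In this sketch, bucket names must be valid PHP array keys (integers or strings).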
getIterator() public
getIterator()
Returns an iterator with the end result of running the Map and Reduce phases on the original data
Returns
\Traversable
Property Detail
$_counter protected
Count of elements emitted during the Reduce phase
Type
int
$_data protected
Holds the original data that needs to be processed
Type
\Traversable
$_executed protected
Whether the Map-Reduce routine has been executed already on the data
Type
bool
$_intermediate protected
Holds the shuffled results that were emitted from the map phase
Type
array
$_mapper protected
A callable that will be executed for each record in the original data
Type
callable
$_reducer protected
A callable that will be executed for each intermediate record emitted during the Map phase
Type
callable|null
$_result protected
Holds the results as emitted during the reduce phase
Type
array
© 2005–present The Cake Software Foundation, Inc.
Licensed under the MIT License.
CakePHP is a registered trademark of Cake Software Foundation, Inc.
We are not endorsed by or affiliated with CakePHP.
https://api.cakephp.org/4.1/class-Cake.Collection.Iterator.MapReduce.html