Class MapReduce
Implements a simplistic version of the popular Map-Reduce algorithm. Acts like an iterator for the original passed data after each result has been processed, thus offering a transparent wrapper for results coming from any source.
- Cake\Collection\Iterator\MapReduce implements IteratorAggregate
Properties summary
- $_counter protected integer
Count of elements emitted during the Reduce phase
- $_data protected Traversable
Holds the original data that needs to be processed
- $_executed protected boolean
Whether the Map-Reduce routine has been executed already on the data
- $_intermediate protected array
Holds the shuffled results that were emitted from the map phase
- $_mapper protected callable
A callable that will be executed for each record in the original data
- $_reducer protected callable
A callable that will be executed for each intermediate record emitted during the Map phase
- $_result protected array
Holds the results as emitted during the reduce phase
Method Summary
- __construct() public
Constructor
- _execute() protected
Runs the actual Map-Reduce algorithm: iterates over the original data and calls the mapper function for each record, then calls the reducer function for each intermediate bucket created during the Map phase.
- emit() public
Appends a new record to the final list of results and optionally assigns a key to this record.
- emitIntermediate() public
Appends a new record to the bucket labelled with $bucket, usually as a result of mapping a single record from the original data.
- getIterator() public
Returns an iterator with the end result of running the Map and Reduce phases on the original data
Method Detail
__construct() public
__construct(Traversable $data, callable $mapper, callable $reducer = null)
Constructor
Example:
Separate all unique odd and even numbers in an array
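use Cake\Collection\Iterator\MapReduce;

$data = new \ArrayObject([1, 2, 3, 4, 5, 3]);
// Map phase: put each number into the 'even' or 'odd' bucket.
$mapper = function ($value, $key, $mr) {
    $type = ($value % 2 === 0) ? 'even' : 'odd';
    $mr->emitIntermediate($value, $type);
};
// Reduce phase: drop duplicates and emit each bucket under its own name.
$reducer = function ($numbers, $type, $mr) {
    $mr->emit(array_unique($numbers), $type);
};
$results = new MapReduce($data, $mapper, $reducer);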
Iterating over $results in the previous example yields the following result:
['odd' => [1, 3, 5], 'even' => [2, 4]]
Parameters
- Traversable $data
The original data to be processed.
- callable $mapper
The mapper callback. This function receives three arguments: the current value, the key of the current record, and this class instance, so that it can call the result emitters.
- callable $reducer (optional, default null)
The reducer callback. This function receives three arguments: the list of values inside a bucket, the name of the bucket that was created during the mapping phase, and an instance of this class.
_execute() protected
_execute()
Runs the actual Map-Reduce algorithm: iterates over the original data and calls the mapper function for each record, then calls the reducer function for each intermediate bucket created during the Map phase.
Throws
LogicException if emitIntermediate() was called but no reducer function was provided
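The two phases and the LogicException condition described above can be illustrated with a small stand-in class. This is only a sketch under stated assumptions: MiniMapReduce and its private members are hypothetical names, and the code mirrors the behaviour documented on this page rather than reproducing the CakePHP source.

```php
<?php
// Hypothetical stand-in, NOT the CakePHP implementation. It only follows the
// behaviour documented on this page so that the control flow is easy to read.
final class MiniMapReduce implements IteratorAggregate
{
    private $data;
    private $mapper;
    private $reducer;
    private $intermediate = [];
    private $result = [];
    private $counter = 0;
    private $executed = false;

    public function __construct(iterable $data, callable $mapper, ?callable $reducer = null)
    {
        $this->data = $data;
        $this->mapper = $mapper;
        $this->reducer = $reducer;
    }

    // Stores a record in the named bucket for the reduce phase.
    public function emitIntermediate($value, $bucket): void
    {
        $this->intermediate[$bucket][] = $value;
    }

    // Stores a final result, keyed explicitly or by a running counter.
    public function emit($value, $key = null): void
    {
        $this->result[$key ?? $this->counter] = $value;
        $this->counter++;
    }

    public function getIterator(): Traversable
    {
        if (!$this->executed) {
            $this->execute();
        }
        return new ArrayIterator($this->result);
    }

    private function execute(): void
    {
        // Map phase: call the mapper once per original record.
        foreach ($this->data as $key => $value) {
            ($this->mapper)($value, $key, $this);
        }
        // Buckets exist but nothing can reduce them: report the misuse.
        if ($this->intermediate !== [] && $this->reducer === null) {
            throw new LogicException('No reducer function was provided');
        }
        // Reduce phase: call the reducer once per bucket.
        foreach ($this->intermediate as $bucket => $values) {
            ($this->reducer)($values, $bucket, $this);
        }
        $this->intermediate = [];
        $this->executed = true;
    }
}

$mr = new MiniMapReduce(
    [1, 2, 3, 4, 5, 3],
    function ($value, $key, $mr) {
        $mr->emitIntermediate($value, $value % 2 === 0 ? 'even' : 'odd');
    },
    function ($numbers, $type, $mr) {
        $mr->emit(array_values(array_unique($numbers)), $type);
    }
);
$result = iterator_to_array($mr);
// $result is ['odd' => [1, 3, 5], 'even' => [2, 4]]
```

In this sketch the check for a missing reducer runs only after the map phase, so a mapper that never calls emitIntermediate() does not trigger the exception.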
emit() public
emit(mixed $value, string|null $key = null)
Appends a new record to the final list of results and optionally assigns a key to this record.
Parameters
- mixed $value
The value to be appended to the final list of results
- string|null $key (optional, default null)
An optional key to assign to the value
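Because the reducer is optional, a mapper may write straight to the final results with emit(). The following is a minimal sketch of that usage; it assumes the cakephp/collection package is installed through Composer, and the autoloader path is an assumption about the project layout.

```php
<?php
// Assumption: a Composer project with cakephp/collection installed.
require __DIR__ . '/vendor/autoload.php';

use Cake\Collection\Iterator\MapReduce;

$data = new ArrayObject(['a' => 1, 'b' => 2, 'c' => 3]);

// No reducer is passed, so the mapper only calls emit(), never emitIntermediate().
$mapper = function ($value, $key, $mr) {
    if ($value % 2 === 1) {
        // Keep odd values only, reusing the original key for the result.
        $mr->emit($value * 10, $key);
    }
};

$results = iterator_to_array(new MapReduce($data, $mapper));
// $results is ['a' => 10, 'c' => 30]
```

Passing the original key as the second argument preserves the association between input and output records; omitting it leaves the key assignment to the class.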
emitIntermediate() public
emitIntermediate(mixed $value, string $bucket)
Appends a new record to the bucket labelled with $bucket, usually as a result of mapping a single record from the original data.
Parameters
- mixed $value
The record itself to store in the bucket
- string $bucket
The name of the bucket in which to put the record
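A word count is a common way to see buckets at work: the mapper emits one record per word into a bucket named after that word, and the reducer counts each bucket. This is a sketch only; it assumes cakephp/collection is installed through Composer, and the input lines are made up for illustration.

```php
<?php
// Assumption: a Composer project with cakephp/collection installed.
require __DIR__ . '/vendor/autoload.php';

use Cake\Collection\Iterator\MapReduce;

$lines = new ArrayIterator(['the quick fox', 'the lazy dog', 'the end']);

// Map phase: one intermediate record per word, bucketed by the word itself.
$mapper = function ($line, $key, $mr) {
    foreach (explode(' ', $line) as $word) {
        $mr->emitIntermediate(1, $word);
    }
};

// Reduce phase: each bucket holds one entry per occurrence, so count them.
$reducer = function ($ones, $word, $mr) {
    $mr->emit(count($ones), $word);
};

$counts = iterator_to_array(new MapReduce($lines, $mapper, $reducer));
// $counts is ['the' => 3, 'quick' => 1, 'fox' => 1, 'lazy' => 1, 'dog' => 1, 'end' => 1]
```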
getIterator() public
getIterator()
Returns an iterator with the end result of running the Map and Reduce phases on the original data
Returns
ArrayIterator
Implementation of IteratorAggregate::getIterator()
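Since the class implements IteratorAggregate, foreach and iterator_to_array() call getIterator() implicitly. The sketch below suggests that the routine runs on first iteration and is not repeated afterwards, which is how the $_executed flag is described on this page. It assumes cakephp/collection is installed through Composer; the $calls counter is a hypothetical helper used only to observe the mapper.

```php
<?php
// Assumption: a Composer project with cakephp/collection installed.
require __DIR__ . '/vendor/autoload.php';

use Cake\Collection\Iterator\MapReduce;

$calls = 0; // hypothetical counter, only here to observe when the mapper runs
$mapper = function ($value, $key, $mr) use (&$calls) {
    $calls++;
    $mr->emit($value * 2); // no key given, so results are indexed numerically
};

$mr = new MapReduce(new ArrayIterator([1, 2, 3]), $mapper);
$callsBefore = $calls; // constructing the object is not expected to run the mapper

$first = iterator_to_array($mr);  // triggers getIterator(), which runs the routine
$second = iterator_to_array($mr); // expected to reuse the stored results
```

If the behaviour matches the description of $_executed, both arrays hold the doubled values and $calls stays at 3 after the second pass.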
Properties detail
$_executed
protected boolean
Whether the Map-Reduce routine has been executed already on the data
Default: false
$_intermediate
protected array
Holds the shuffled results that were emitted from the map phase
Default: []
$_mapper
protected callable
A callable that will be executed for each record in the original data
$_reducer
protected callable
A callable that will be executed for each intermediate record emitted during the Map phase
© 2005–2016 The Cake Software Foundation, Inc.
Licensed under the MIT License.
CakePHP is a registered trademark of Cake Software Foundation, Inc.
We are not endorsed by or affiliated with CakePHP.
http://api.cakephp.org/3.2/class-Cake.Collection.Iterator.MapReduce.html