javascript - User recognition without cookies or local storage

Reposted. Author: 搜寻专家. Updated: 2023-10-31 22:02:57

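I'm building an analytics tool, and I can currently get the user's IP address, browser, and operating system from their user agent.

I'm wondering whether it is possible to detect the same user without using cookies or local storage. I'm not expecting code examples here, just a simple hint about where to look further.

I forgot to mention that it needs to be cross-browser compatible if it's the same computer/device. Basically, I'm after device recognition rather than user recognition.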

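Best answer

Introduction

If I understand you correctly, you need to identify a user for whom you have no unique identifier, so you want to work out who they are by matching random data. You cannot store the user's identity reliably because:

  • Cookies can be deleted
  • The IP address can change
  • The browser can change
  • The browser cache may be cleared

A Java applet or COM object that hashes hardware information would have been an easy solution, but people are now so security-aware that it would be hard to get them to install this kind of program on their systems. This leaves you stuck with cookies and other, similar tools.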

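    Cookies and other, similar tools

    You might consider building a data profile, then using probability tests to identify a probable user. A profile useful for this can be generated from some combination of the following:

  • IP address
      - Real IP address
      - Proxy IP address (users often use the same proxy repeatedly)
  • Cookies
      - HTTP cookies
      - Session cookies
      - 3rd-party cookies
      - Flash cookies (most people don't know how to delete these)
  • Web bugs (less reliable because bugs get fixed, but still useful)
      - PDF bug
      - Flash bug
      - Java bug
  • Browsers
      - Click tracking (many users visit the same series of pages on each visit)
      - Browser fingerprint
          - Installed plugins (people often have varied, somewhat unique sets of plugins)
      - Cached images (people sometimes delete their cookies but leave cached images)
      - Using blobs
      - URL(s) (browser history or cookies may contain unique user IDs in URLs, such as https://stackoverflow.com/users/1226894 or http://www.facebook.com/barackobama?fref=ts)
      - System fonts detection (a little-known but often unique key signature)
  • HTML5 and JavaScript
      - HTML5 LocalStorage
      - HTML5 Geolocation API and reverse geocoding
      - Architecture, OS language, system time, screen resolution, etc.
      - Network Information API
      - Battery Status API

    The items I listed are, of course, just a few of the possible ways a user can be uniquely identified. There are many more.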
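    As a rough illustration of the server-side part of such a profile, the sketch below hashes a few request values that PHP exposes in $_SERVER. The function names buildProfile and sharedSignals and the chosen signal names are hypothetical, and the set of headers is an assumption rather than a complete list; client-side signals such as fonts or plugins would have to be collected in the browser and posted back separately.

```php
<?php
// Hypothetical helper: builds a coarse data profile from a $_SERVER-style array.
// The selected keys are illustrative only.
function buildProfile(array $server): array
{
    $signals = [
        'remote_ip'    => $server['REMOTE_ADDR'] ?? '',
        // X-Forwarded-For is sent by the client or a proxy and is easy to spoof.
        'forwarded_ip' => $server['HTTP_X_FORWARDED_FOR'] ?? '',
        'agent'        => $server['HTTP_USER_AGENT'] ?? '',
        'language'     => $server['HTTP_ACCEPT_LANGUAGE'] ?? '',
        'encoding'     => $server['HTTP_ACCEPT_ENCODING'] ?? '',
    ];

    // Hash each signal separately so two profiles can be compared field by field.
    $profile = [];
    foreach ($signals as $name => $value) {
        $profile[$name] = $value === '' ? null : hash('sha256', $value);
    }
    return $profile;
}

// Returns the names of the signals that are present and identical in both profiles.
function sharedSignals(array $a, array $b): array
{
    $shared = [];
    foreach ($a as $name => $hash) {
        if ($hash !== null && ($b[$name] ?? null) === $hash) {
            $shared[] = $name;
        }
    }
    return $shared;
}

// Two made-up requests: same browser and language, different IP address.
$first = buildProfile([
    'REMOTE_ADDR'          => '203.0.113.7',
    'HTTP_USER_AGENT'      => 'ExampleBrowser/1.0',
    'HTTP_ACCEPT_LANGUAGE' => 'en-GB',
]);
$second = buildProfile([
    'REMOTE_ADDR'          => '198.51.100.9',
    'HTTP_USER_AGENT'      => 'ExampleBrowser/1.0',
    'HTTP_ACCEPT_LANGUAGE' => 'en-GB',
]);
echo implode(', ', sharedSignals($first, $second)), "\n";
```

    Keeping one hash per signal, instead of one hash over everything, is what makes partial matching possible: a single changed value (here the IP address) does not invalidate the rest of the profile.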

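    With this set of random data elements to build a data profile from, what's next?

    The next step is to develop some Fuzzy Logic or, better yet, an Artificial Neural Network (which uses fuzzy logic). In either case, the idea is to train your system and then combine its training with Bayesian Inference to increase the accuracy of your results.

    The NeuralMesh library for PHP allows you to generate artificial neural networks. To implement Bayesian inference, check out the following links:
  • Implement Bayesian inference using PHP, Part 1
  • Implement Bayesian inference using PHP, Part 2
  • Implement Bayesian inference using PHP, Part 3

    At this point, you may be thinking:

    Why so much math and logic for a seemingly simple task?

    Basically, because it is not a simple task. What you are trying to achieve is, in fact, pure probability. For example, given the following known users:
    User1 = A + B + C + D + G + K
    User2 = C + D + I + J + K + F

    When you receive the following data:
    B + C + E + G + F + K

    The question you are essentially asking is:

    What is the probability that the received data (B + C + E + G + F + K) is actually User1 or User2? And which of those two matches is the most probable?

    To answer this question effectively, you need to understand Frequency vs Probability Format and why Joint Probability might be a better approach. The details are too much to go into here (which is why I'm giving you links), but a good example is a Medical Diagnosis Wizard Application, which uses a combination of symptoms to identify possible diseases.

    Think for a moment of the series of data points that make up your data profile (B + C + E + G + F + K in the example above) as symptoms, and of unknown users as diseases. By identifying the disease, you can then identify an appropriate treatment (treat this user as User1).

    Obviously, a disease for which we have identified more than one symptom is easier to identify. In fact, the more symptoms we can identify, the easier and more accurate our diagnosis is almost certain to be.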
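    To make the User1 / User2 question concrete, here is a minimal naive Bayes sketch. It is not taken from the linked articles. It assumes equal priors, treats every trait as independent, and uses two made-up likelihoods: a trait stored for a user is seen again with probability 0.9, and a trait not stored for that user shows up with probability 0.1. The function names likelihood and posterior are hypothetical.

```php
<?php
// Illustrative assumptions, not measured values.
const P_SEEN_IF_KNOWN = 0.9;   // trait is in the stored profile
const P_SEEN_IF_UNKNOWN = 0.1; // trait is not in the stored profile

// Likelihood of the observed traits given one stored profile, treating every
// trait in $universe as independent (the naive Bayes assumption).
function likelihood(array $profile, array $observed, array $universe): float
{
    $p = 1.0;
    foreach ($universe as $trait) {
        $pSeen = in_array($trait, $profile, true) ? P_SEEN_IF_KNOWN : P_SEEN_IF_UNKNOWN;
        $p *= in_array($trait, $observed, true) ? $pSeen : 1.0 - $pSeen;
    }
    return $p;
}

// Posterior probability per user, assuming equal priors.
function posterior(array $users, array $observed): array
{
    // Every trait that appears in the observation or in any stored profile.
    $universe = array_unique(array_merge($observed, ...array_values($users)));
    $scores = [];
    foreach ($users as $name => $profile) {
        $scores[$name] = likelihood($profile, $observed, $universe);
    }
    $total = array_sum($scores);
    return array_map(fn ($s) => $s / $total, $scores);
}

$users = [
    'User1' => ['A', 'B', 'C', 'D', 'G', 'K'],
    'User2' => ['C', 'D', 'I', 'J', 'K', 'F'],
];
$result = posterior($users, ['B', 'C', 'E', 'G', 'F', 'K']);
printf("User1: %.4f, User2: %.4f\n", $result['User1'], $result['User2']);
```

    Under these assumptions User1 agrees with the observation on six of the ten traits and User2 on four, so the likelihood ratio is 0.9² / 0.1² = 81 and the posterior for User1 is 81/82, about 0.988. Different likelihoods or priors would change the numbers but not the method.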

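    Are there any other alternatives?

    Of course. As an alternative, you might create your own simple scoring algorithm and base it on exact matches. This is not as effective as probability, but it may be simpler for you to implement.

    As an example, consider this simple score chart: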
    +-------------------------+--------+------------+
    |        Property         | Weight | Importance |
    +-------------------------+--------+------------+
    | Real IP address         |     60 |          5 |
    | Used proxy IP address   |     40 |          4 |
    | HTTP Cookies            |     80 |          8 |
    | Session Cookies         |     80 |          6 |
    | 3rd Party Cookies       |     60 |          4 |
    | Flash Cookies           |     90 |          7 |
    | PDF Bug                 |     20 |          1 |
    | Flash Bug               |     20 |          1 |
    | Java Bug                |     20 |          1 |
    | Frequent Pages          |     40 |          1 |
    | Browsers Finger Print   |     35 |          2 |
    | Installed Plugins       |     25 |          1 |
    | Cached Images           |     40 |          3 |
    | URL                     |     60 |          4 |
    | System Fonts Detection  |     70 |          4 |
    | Localstorage            |     90 |          8 |
    | Geolocation             |     70 |          6 |
    | AOLTR                   |     70 |          4 |
    | Network Information API |     40 |          3 |
    | Battery Status API      |     20 |          1 |
    +-------------------------+--------+------------+

    For each piece of information which you can gather on a given request, award the associated score, then use Importance to resolve conflicts when scores are the same.
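    A minimal sketch of that rule follows, using a subset of the score chart above. It reads "use Importance to resolve conflicts" as "compare the summed importance when the summed weights tie", which is one possible interpretation. The names scoreCandidate and rankCandidates and the candidates X, Y and Z are hypothetical.

```php
<?php
// Subset of the score chart: property => [weight, importance]
const SCORE_CHART = [
    'Real IP address'        => [60, 5],
    'HTTP Cookies'           => [80, 8],
    'Cached Images'          => [40, 3],
    'System Fonts Detection' => [70, 4],
    'Geolocation'            => [70, 6],
];

// Sums weight and importance over the properties that matched exactly.
function scoreCandidate(array $matchedProperties): array
{
    $score = $importance = 0;
    foreach ($matchedProperties as $property) {
        [$weight, $rank] = SCORE_CHART[$property];
        $score += $weight;
        $importance += $rank;
    }
    return ['score' => $score, 'importance' => $importance];
}

// Ranks candidates by score (highest first), using importance only to break ties.
function rankCandidates(array $candidates): array
{
    $ranked = array_map('scoreCandidate', $candidates);
    uasort($ranked, fn ($a, $b) =>
        [$b['score'], $b['importance']] <=> [$a['score'], $a['importance']]);
    return $ranked;
}

// Hypothetical stored candidates and the properties each one matched.
$ranked = rankCandidates([
    'X' => ['Real IP address', 'System Fonts Detection'], // score 130, importance 9
    'Y' => ['Real IP address', 'Geolocation'],            // score 130, importance 11
    'Z' => ['HTTP Cookies', 'Cached Images'],             // score 120, importance 11
]);
echo implode(', ', array_keys($ranked)), "\n"; // prints "Y, X, Z"
```

    X and Y tie on score, so Y wins on importance; Z has the same importance as Y but a lower score, so it stays last.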

    Proof of Concept

    For a simple proof of concept, please take a look at the Perceptron. The Perceptron is an RNA (artificial neural network) model that is generally used in pattern recognition applications. There is even an old PHP class that implements it perfectly, but you would likely need to modify it for your purposes.

    Despite being a great tool, Perceptron can still return multiple results (possible matches), so using a Score and Difference comparison is still useful to identify the best of those matches.

    Assumptions

    • Store all possible information about each user (IP, cookies, etc.)
    • Where result is an exact match, increase score by 1
    • Where result is not an exact match, decrease score by 1

    Expectation

    1. Generate RNA labels
    2. Generate random users emulating a database
    3. Generate a single Unknown user
    4. Generate Unknown user RNA and Values
    5. The system will merge RNA information and teach the Perceptron
    6. After training the Perceptron, the system will have a set of weightings
    7. You can now test the Unknown user's pattern and the Perceptron will produce a result set.
    8. Store all Positive matches
    9. Sort the matches first by Score, then by Difference (as described above)
    10. Output the two closest matches, or, if no matches are found, output empty results

    Code for Proof of Concept

    $features = array(
        'Real IP address' => .5,
        'Used proxy IP address' => .4,
        'HTTP Cookies' => .9,
        'Session Cookies' => .6,
        '3rd Party Cookies' => .6,
        'Flash Cookies' => .7,
        'PDF Bug' => .2,
        'Flash Bug' => .2,
        'Java Bug' => .2,
        'Frequent Pages' => .3,
        'Browsers Finger Print' => .3,
        'Installed Plugins' => .2,
        'URL' => .5,
        'Cached PNG' => .4,
        'System Fonts Detection' => .6,
        'Localstorage' => .8,
        'Geolocation' => .6,
        'AOLTR' => .4,
        'Network Information API' => .3,
        'Battery Status API' => .2
    );

    // Get RNA labels (x1 .. x20), one per feature
    $labels = array();
    $n = 1;
    foreach ($features as $k => $v) {
        $labels[$k] = "x" . $n;
        $n ++;
    }

    // Create users
    $users = array();
    for($i = 0, $name = "A"; $i < 5; $i ++, $name ++) {
        $users[] = new Profile($name, $features);
    }

    // Generate unknown user
    $unknown = new Profile("Unknown", $features);

    // Generate unknown RNA: row 0 is the desired pattern, row 1 its negation
    $unknownRNA = array(
        0 => array("o" => 1),
        1 => array("o" => - 1)
    );

    // Create RNA values
    foreach ($unknown->data as $item => $point) {
        $unknownRNA[0][$labels[$item]] = $point;
        $unknownRNA[1][$labels[$item]] = (- 1 * $point);
    }

    // Start Perceptron class
    $perceptron = new Perceptron();

    // Train results
    $trainResult = $perceptron->train($unknownRNA, 1, 1);

    // Find matches
    $matchs = array(); // initialized so count() below is safe when nothing matches
    foreach ($users as $name => $profile) {
        // Use shorter labels
        $data = array_combine($labels, $profile->data);
        if ($perceptron->testCase($data, $trainResult) == true) {
            $score = $diff = 0;

            // Determine the score and difference
            foreach ($unknown->data as $item => $found) {
                if ($unknown->data[$item] === $profile->data[$item]) {
                    if ($profile->data[$item] > 0) {
                        $score += $features[$item];
                    } else {
                        $diff += $features[$item];
                    }
                }
            }
            // Set score and diff
            $profile->setScore($score, $diff);
            $matchs[] = $profile;
        }
    }

    // Sort based on score and output (a single match is still a match)
    if (count($matchs) > 0) {
        usort($matchs, function ($a, $b) {
            // If score is the same use difference
            if ($a->score == $b->score) {
                // The lower the difference the better
                return $a->diff == $b->diff ? 0 : ($a->diff > $b->diff ? 1 : - 1);
            }
            // The higher the score the better
            return $a->score > $b->score ? - 1 : 1;
        });

        echo "<br />Possible Match ", implode(",", array_slice(array_map(function ($v) {
            return sprintf(" %s (%0.4f|%0.4f) ", $v->name, $v->score,$v->diff);
        }, $matchs), 0, 2));
    } else {
        echo "<br />No match Found ";
    }

    Output:
    Possible Match D (0.7416|0.1685),C (0.5393|0.2809)

    print_r of "D":
    echo "<pre>";
    print_r($matchs[0]);


    Profile Object
    (
        [name] => D
        [data] => Array
            (
                [Real IP address] => -1
                [Used proxy IP address] => -1
                [HTTP Cookies] => 1
                [Session Cookies] => 1
                [3rd Party Cookies] => 1
                [Flash Cookies] => 1
                [PDF Bug] => 1
                [Flash Bug] => 1
                [Java Bug] => -1
                [Frequent Pages] => 1
                [Browsers Finger Print] => -1
                [Installed Plugins] => 1
                [URL] => -1
                [Cached PNG] => 1
                [System Fonts Detection] => 1
                [Localstorage] => -1
                [Geolocation] => -1
                [AOLTR] => 1
                [Network Information API] => -1
                [Battery Status API] => -1
            )

        [score] => 0.74157303370787
        [diff] => 0.1685393258427
        [base] => 8.9
    )

    If Debug = true, you will be able to see Input (Sensor & Desired), Initial Weights, Output (Sensor, Sum, Network), Error, Correction and Final Weights:
    +----+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+-----+----+---------+---------+---------+---------+---------+---------+---------+---------+---------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----------+
    | o | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | x11 | x12 | x13 | x14 | x15 | x16 | x17 | x18 | x19 | x20 | Bias | Yin | Y | deltaW1 | deltaW2 | deltaW3 | deltaW4 | deltaW5 | deltaW6 | deltaW7 | deltaW8 | deltaW9 | deltaW10 | deltaW11 | deltaW12 | deltaW13 | deltaW14 | deltaW15 | deltaW16 | deltaW17 | deltaW18 | deltaW19 | deltaW20 | W1 | W2 | W3 | W4 | W5 | W6 | W7 | W8 | W9 | W10 | W11 | W12 | W13 | W14 | W15 | W16 | W17 | W18 | W19 | W20 | deltaBias |
    +----+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+-----+----+---------+---------+---------+---------+---------+---------+---------+---------+---------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----------+
    | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 0 | -1 | 0 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 0 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 |
    | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | -1 | -1 | 1 | -19 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 |
    | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
    | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 19 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 |
    | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | -1 | -1 | 1 | -19 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 |
    | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
    +----+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+-----+----+---------+---------+---------+---------+---------+---------+---------+---------+---------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----------+

    x1 to x20 represent the features, converted by this code:
    // Get RNA Labels
    $labels = array();
    $n = 1;
    foreach ( $features as $k => $v ) {
        $labels[$k] = "x" . $n;
        $n ++;
    }

    Here is an online demo

    Class used:
    class Profile {
        public $name, $data = array(), $score, $diff, $base;

        function __construct($name, array $importance) {
            $values = array(-1, 1); // Perceptron values
            $this->name = $name;
            foreach ($importance as $item => $point) {
                // Generate a random true/false (1 / -1) value for each item
                $this->data[$item] = $values[mt_rand(0, 1)];
            }
            // Sum of all feature weights, used to normalize score and diff
            $this->base = array_sum($importance);
        }

        public function setScore($score, $diff) {
            $this->score = $score / $this->base;
            $this->diff = $diff / $this->base;
        }
    }

    Modified Perceptron class
    class Perceptron {
        private $w = array();
        private $dw = array();
        public $debug = false;

        private function initialize($colums) {
            // Initialize perceptron vars
            for($i = 1; $i <= $colums; $i ++) {
                // weighting vars
                $this->w[$i] = 0;
                $this->dw[$i] = 0;
            }
        }

        function train($input, $alpha, $teta) {
            $colums = count($input[0]) - 1;
            $weightCache = array_fill(1, $colums, 0);
            $checkpoints = array();
            $keepTrainning = true;

            // Initialize RNA vars
            $this->initialize(count($input[0]) - 1);
            $just_started = true;
            $totalRun = 0;
            $yin = 0;

            // Trains RNA until it gets stable
            while ($keepTrainning == true) {
                // Sweeps each row of the input subject
                foreach ($input as $row_counter => $row_data) {
                    // Finds out the number of columns the input has
                    $n_columns = count($row_data) - 1;

                    // Calculates Yin
                    $yin = 0;
                    for($i = 1; $i <= $n_columns; $i ++) {
                        $yin += $row_data["x" . $i] * $weightCache[$i];
                    }

                    // Calculates Real Output
                    $Y = ($yin <= 1) ? - 1 : 1;

                    // Sweeps columns ...
                    $checkpoints[$row_counter] = 0;
                    for($i = 1; $i <= $n_columns; $i ++) {
                        /** DELTAS **/
                        // Is it the first row?
                        if ($just_started == true) {
                            $this->dw[$i] = $weightCache[$i];
                            $just_started = false;
                            // Found desired output?
                        } elseif ($Y == $row_data["o"]) {
                            $this->dw[$i] = 0;
                            // Calculates Delta Ws
                        } else {
                            $this->dw[$i] = $row_data["x" . $i] * $row_data["o"];
                        }

                        /** WEIGHTS **/
                        // Calculate Weights
                        $this->w[$i] = $this->dw[$i] + $weightCache[$i];
                        $weightCache[$i] = $this->w[$i];

                        /** CHECK-POINT **/
                        $checkpoints[$row_counter] += $this->w[$i];
                    } // END - for

                    foreach ($this->w as $index => $w_item) {
                        $debug_w["W" . $index] = $w_item;
                        $debug_dw["deltaW" . $index] = $this->dw[$index];
                    }

                    // Special for script debugging
                    $debug_vars[] = array_merge($row_data, array(
                        "Bias" => 1,
                        "Yin" => $yin,
                        "Y" => $Y
                    ), $debug_dw, $debug_w, array(
                        "deltaBias" => 1
                    ));
                } // END - foreach

                // Special for script debugging
                $empty_data_row = array();
                for($i = 1; $i <= $n_columns; $i ++) {
                    $empty_data_row["x" . $i] = "--";
                    $empty_data_row["W" . $i] = "--";
                    $empty_data_row["deltaW" . $i] = "--";
                }
                $debug_vars[] = array_merge($empty_data_row, array(
                    "o" => "--",
                    "Bias" => "--",
                    "Yin" => "--",
                    "Y" => "--",
                    "deltaBias" => "--"
                ));

                // Counts training times
                $totalRun ++;

                // Now checks if the RNA is stable already
                $referer_value = end($checkpoints);
                // if all rows match the desired output ...
                $sum = array_sum($checkpoints);
                $n_rows = count($checkpoints);
                if ($totalRun > 1 && ($sum / $n_rows) == $referer_value) {
                    $keepTrainning = false;
                }
            } // END - while

            // Prepares the final result
            $result = array();
            for($i = 1; $i <= $n_columns; $i ++) {
                $result["w" . $i] = $this->w[$i];
            }

            $this->debug($this->print_html_table($debug_vars));

            return $result;
        } // END - train

        function testCase($input, $results) {
            // Sweeps input columns
            $result = 0;
            $i = 1;
            foreach ($input as $column_value) {
                // Calculates test Y
                $result += $results["w" . $i] * $column_value;
                $i ++;
            }
            // Checks which class the test fits
            return ($result > 0) ? true : false;
        } // END - testCase

        // Returns the HTML code of a table based on a hash array
        function print_html_table($array) {
            $html = "";
            $inner_html = "";
            $table_header_composed = false;
            $table_header = array();

            // Builds table contents
            foreach ($array as $array_item) {
                $inner_html .= "<tr>\n";
                foreach ( $array_item as $array_col_label => $array_col ) {
                    $inner_html .= "<td>\n";
                    $inner_html .= $array_col;
                    $inner_html .= "</td>\n";

                    if ($table_header_composed == false) {
                        $table_header[] = $array_col_label;
                    }
                }
                $table_header_composed = true;
                $inner_html .= "</tr>\n";
            }

            // Builds full table
            $html = "<table border=1>\n";
            $html .= "<tr>\n";
            foreach ($table_header as $table_header_item) {
                $html .= "<td>\n";
                $html .= "<b>" . $table_header_item . "</b>";
                $html .= "</td>\n";
            }
            $html .= "</tr>\n";

            $html .= $inner_html . "</table>";

            return $html;
        } // END - print_html_table

        // Debug function
        function debug($message) {
            if ($this->debug == true) {
                echo "<b>DEBUG:</b> $message";
            }
        } // END - debug
    } // END - class

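    Conclusion

    Identifying a user without a unique identifier is not a straightforward or simple task. It depends on gathering a sufficient amount of random data, which you can collect from the user by a variety of methods.

    Even if you choose not to use an Artificial Neural Network, I suggest at least using a simple probability matrix with priorities and likelihoods, and I hope the code and examples provided above give you enough to go on.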

    Regarding "javascript - User recognition without cookies or local storage", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24734278/
