
java - Sorting array elements by frequency in Java

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 05:50:55

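I have written code that sorts an array in Java by how often each element occurs. I need better code or pseudocode (without the Collections framework). Please provide a link or code.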

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;

public class SortByFreq1 {

    public static void main(String[] args) {

        int arr[] = { 2, 5, 2, 8, 5, 6, 8, 8, 0, -8 };

        int nArr[] = new int[arr.length];

        Map<Integer,Integer> map = new HashMap<Integer, Integer>();
        Map<Integer,Integer> sortmap = new HashMap<Integer, Integer>();
        ArrayList<Integer> arrList = new ArrayList<Integer>();

        for (int i = 0; i < arr.length; i++) {
            arrList.add(arr[i]);
        }

        Set<Integer> set = new HashSet<Integer>(arrList);

        // count how often each distinct value occurs
        for (Integer i : set) {
            map.put(i, Collections.frequency(arrList, i));
        }

        // System.out.println(map.keySet());
        // sort map by value (descending frequency)

        Set<Entry<Integer,Integer>> valList = map.entrySet();
        ArrayList<Entry<Integer, Integer>> tempLst = new ArrayList<Map.Entry<Integer, Integer>>(valList);

        Collections.sort(tempLst, new Comparator<Entry<Integer, Integer>>() {
            @Override
            public int compare(Entry<Integer, Integer> o1, Entry<Integer, Integer> o2) {
                return o2.getValue().compareTo(o1.getValue());
            }
        });

        int k = 0;

        // write each value into the result as many times as it occurred
        for (Entry<Integer, Integer> entry : tempLst) {
            int no = entry.getKey();
            int noOfTimes = entry.getValue();

            int i = 0;

            while (i < noOfTimes) {
                nArr[k++] = no;
                i++;
            }
        }

        for (int i = 0; i < nArr.length; i++)
            System.out.print(nArr[i] + " ");
    }
}

Best answer

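The logic behind this is very similar to Counting Sort.

Note: the array passed in is not modified.

There are two different approaches below. Both need O(n) extra space, and their running times are close in practice.

  • Time complexity: O(n + k log k) for the built-in approach; the custom approach sorts the whole array first, so it is O(n log n);
  • Space complexity: O(n), for the array that is returned;

Here k is the number of distinct values in the array.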
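To make the Counting Sort connection concrete, here is a minimal sketch that uses only primitive arrays. It is not the answer's code: the class name `CountingFrequencySort` is hypothetical, it assumes the value range `max - min` is small enough to allocate a counts array, and breaking ties by ascending value is my own choice.

```java
import java.util.Arrays;

class CountingFrequencySort {
    // Returns a new array ordered by descending frequency; the input is not modified.
    static int[] sortByFrequency(int[] arr) {
        int n = arr.length;
        if (n == 0) return new int[0];

        int min = arr[0], max = arr[0];
        for (int v : arr) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }

        // Assumption: (max - min + 1) is small enough to allocate and does not overflow.
        int[] counts = new int[max - min + 1];
        for (int v : arr) counts[v - min]++;

        // start[f] first holds the number of output slots taken by values of frequency f ...
        int[] start = new int[n + 1];
        for (int c : counts) {
            if (c > 0) start[c] += c;
        }
        // ... and is then turned into the first output index for that frequency,
        // walking from the highest frequency down.
        int offset = 0;
        for (int f = n; f >= 1; f--) {
            int used = start[f];
            start[f] = offset;
            offset += used;
        }

        int[] out = new int[n];
        for (int i = 0; i < counts.length; i++) {
            int c = counts[i];
            if (c == 0) continue;
            Arrays.fill(out, start[c], start[c] + c, i + min);
            start[c] += c; // next value with the same frequency goes right after
        }
        return out;
    }
}
```

For the question's input `{ 2, 5, 2, 8, 5, 6, 8, 8, 0, -8 }` this returns `8 8 8 2 2 5 5 -8 0 6`. The running time is O(n + r), where r is the size of the value range, so it only pays off when r is not much larger than n.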

The built-in Collections approach

Using Stream may make the process a little cleaner, although the OP did not ask for this:

// Requires: import java.util.*; import java.util.stream.Collectors;

/**
 * 1. Count the frequency of each value, sort the entries by descending frequency,
 *    and collect them into a LinkedHashMap to retain that order;
 * 2. Fill the new array run by run while traversing the LinkedHashMap.
 *
 * @param arr the input array (not modified)
 * @return a new array with the elements ordered by descending frequency
 */
private static int[] sortByCounting(int[] arr) {
    Map<Integer, Long> countMap = Arrays.stream(arr).boxed()
            .collect(Collectors.groupingBy(Integer::intValue, Collectors.counting()))
            .entrySet().stream()
            .sorted((e1, e2) -> e2.getValue().compareTo(e1.getValue()))
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (oldV, newV) -> oldV, LinkedHashMap::new));
    int[] newArr = new int[arr.length];
    int i = 0;
    for (Map.Entry<Integer, Long> entry : countMap.entrySet()) {
        // arguments are evaluated left to right: the old i is the start, the updated i is the end
        Arrays.fill(newArr, i, i += entry.getValue().intValue(), entry.getKey());
    }
    return newArr;
}

The custom approach

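Since we cannot use the built-in Collections methods, we have to keep track of the count of each number ourselves.

A natural choice is a custom pair type that records a number together with its frequency (or count), which is the basis of the custom approach: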


private static int[] sortByPlainCounting(int[] arr) {
    if (arr.length < 1) throw new IllegalArgumentException("Array cannot be empty");
    MyPair[] pairs = prepareMyPairs(arr);
    // highest count first; the object sort is stable, so ties keep their order from prepareMyPairs
    Arrays.sort(pairs, Comparator.comparing(MyPair::getCount).reversed());
    int[] newArr = new int[arr.length];
    int i = 0;
    for (MyPair pair : pairs) {
        Arrays.fill(newArr, i, i += pair.count, pair.key);
    }
    return newArr;
}

// A number (key) together with how many times it occurs (count).
static class MyPair {
    int key;
    int count;

    public MyPair(int theKey) {
        this.key = theKey;
        this.count = 1;
    }

    public void inc() {
        this.count++;
    }

    public int getCount() {
        return this.count;
    }
}

// Expects a non-empty array; builds one MyPair per distinct value.
static MyPair[] prepareMyPairs(int[] arr) {
    // sort a boxed copy (descending) so that equal values become adjacent
    Integer[] tmpArr = Arrays.stream(arr).boxed().toArray(Integer[]::new);
    Arrays.sort(tmpArr, Comparator.reverseOrder());

    // first pass: count the distinct values
    int count = 1;
    int prev = tmpArr[0];
    for (int i = 1; i < tmpArr.length; i++) {
        if (tmpArr[i] != prev) {
            prev = tmpArr[i];
            count++;
        }
    }

    // second pass: one pair per run of equal values
    MyPair[] pairs = new MyPair[count];
    int k = 0;
    pairs[0] = new MyPair(tmpArr[0]);
    for (int i = 1; i < tmpArr.length; i++) {
        if (pairs[k].key == tmpArr[i]) {
            pairs[k].inc();
        } else {
            pairs[++k] = new MyPair(tmpArr[i]);
        }
    }
    return pairs;
}
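The custom approach above still boxes every element and relies on `Comparator`. As a hedged alternative, the sketch below sticks to primitive arrays and `Arrays.sort` only. It is not part of the original answer: the class name `PrimitiveFrequencySort` and the trick of packing each (count, value) pair into one `long` are my own assumptions, and the order among values with equal frequency is simply whatever the packed bits produce.

```java
import java.util.Arrays;

class PrimitiveFrequencySort {
    // Returns a new array ordered by descending frequency; the input is not modified.
    static int[] sortByFrequency(int[] arr) {
        int n = arr.length;
        int[] sorted = arr.clone();
        Arrays.sort(sorted); // equal values are now adjacent

        // Run-length encode: high 32 bits = count, low 32 bits = the value's raw bits.
        long[] packed = new long[n];
        int k = 0; // number of distinct values found so far
        for (int i = 0; i < n; ) {
            int j = i;
            while (j < n && sorted[j] == sorted[i]) j++;
            packed[k++] = ((long) (j - i) << 32) | (sorted[i] & 0xFFFFFFFFL);
            i = j;
        }

        // Counts are positive, so ascending order of the packed longs is ascending by count.
        Arrays.sort(packed, 0, k);

        // Walk backwards to emit the most frequent value first.
        int[] out = new int[n];
        int pos = 0;
        for (int p = k - 1; p >= 0; p--) {
            int count = (int) (packed[p] >>> 32);
            int value = (int) packed[p]; // truncation restores the original int
            Arrays.fill(out, pos, pos + count, value);
            pos += count;
        }
        return out;
    }
}
```

For the question's input `{ 2, 5, 2, 8, 5, 6, 8, 8, 0, -8 }` this returns `8 8 8 5 5 2 2 -8 6 0`. The cost is dominated by sorting the n-element copy, so it is O(n log n) like the custom approach, but without allocating one object per element.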

Comparison demo

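As a final comparison, the run below suggests that:

  1. the average time cost of the custom approach is somewhat worse than that of the built-in Collections approach (about 1.3 times slower), while its worst case is much better (more than 4 times faster);
  2. the custom approach is correct, in the sense that it yields the same sequence of frequencies as the built-in one;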

public static void main(String[] args) {
    int N = 10_000 + new Random().nextInt(100);
    long start;
    List<Long> list0 = new ArrayList<>();
    List<Long> list1 = new ArrayList<>();
    for (int i = 0; i < 100; ++i) {
        int[] arr = RandomGenerator.generateArrays(N, N, N / 10, N / 5, false);

        // time the built-in Collections approach
        start = System.nanoTime();
        int[] arr0 = sortByCounting(arr);
        list0.add(System.nanoTime() - start);

        // time the custom approach on the same input
        start = System.nanoTime();
        int[] arr1 = sortByPlainCounting(arr);
        list1.add(System.nanoTime() - start);

        System.out.println("Sorted by frequency: " + isFrequencyEqual(arr0, arr1));
    }
    System.out.println("Collection time cost: " + list0.stream().collect(Collectors.summarizingLong(Long::valueOf)));
    System.out.println("Custom time cost: " + list1.stream().collect(Collectors.summarizingLong(Long::valueOf)));
}


// Compares the descending frequency sequences of the two results.
private static boolean isFrequencyEqual(int[] arr0, int[] arr1) {
    Map<Integer, Long> countMap0 = getCountMap(arr0);
    Map<Integer, Long> countMap1 = getCountMap(arr1);
    boolean isEqual = countMap0.entrySet().size() == countMap1.entrySet().size();
    if (!isEqual) return false;
    isEqual = countMap0.values().containsAll(countMap1.values()) &&
            countMap1.values().containsAll(countMap0.values());
    if (!isEqual) return false;
    List<Long> countList0 = countMap0.values().stream().collect(Collectors.toList());
    List<Long> countList1 = countMap1.values().stream().collect(Collectors.toList());
    for (int i = 0; i < countList0.size(); i++) {
        // use equals: != on boxed Long compares references and only happens to work for small cached values
        if (!countList1.get(i).equals(countList0.get(i))) return false;
    }
    return true;
}

// Counts each value and returns the entries ordered by descending count.
private static Map<Integer, Long> getCountMap(int[] arr) {
    return Arrays.stream(arr).boxed()
            .collect(Collectors.groupingBy(Integer::intValue, Collectors.counting()))
            .entrySet().stream()
            .sorted((e1, e2) -> e2.getValue().compareTo(e1.getValue()))
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (oldV, newV) -> oldV, LinkedHashMap::new));
}
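The check above only compares the two frequency sequences with each other. A stricter check, sketched here under my own assumptions (the class name `FrequencyOrderCheck` and both method names are hypothetical, not from the answer), verifies a single result directly: it must contain the same elements as the input, each value must sit in one contiguous run, and the run lengths must never increase.

```java
import java.util.Arrays;

class FrequencyOrderCheck {
    // True if output is a rearrangement of input, grouped by value,
    // with group sizes in non-increasing order.
    static boolean isSortedByFrequency(int[] input, int[] output) {
        if (input.length != output.length) return false;

        int[] a = input.clone();
        int[] b = output.clone();
        Arrays.sort(a);
        Arrays.sort(b);
        if (!Arrays.equals(a, b)) return false; // not the same multiset of values

        int prevRun = Integer.MAX_VALUE;
        for (int i = 0; i < output.length; ) {
            int j = i;
            while (j < output.length && output[j] == output[i]) j++;
            int run = j - i;
            // the run must contain every occurrence of this value
            if (run != countOf(b, output[i])) return false;
            // frequencies must not increase from one run to the next
            if (run > prevRun) return false;
            prevRun = run;
            i = j;
        }
        return true;
    }

    // Plain linear count, kept simple for clarity rather than speed.
    static int countOf(int[] values, int target) {
        int c = 0;
        for (int v : values) {
            if (v == target) c++;
        }
        return c;
    }
}
```

Because it accepts any tie order, the same check can be applied to the result of either approach without comparing them to each other.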

Helper utility method:

// Lives in the answer's own RandomGenerator helper class, as called from main above.
public static int[] generateArrays(int minSize, int maxSize, int low, int high, boolean isUnique) {
    Random random = new Random(System.currentTimeMillis());
    int N = random.nextInt(maxSize - minSize + 1) + minSize;
    if (isUnique) {
        // note: this loop cannot finish if N exceeds (high - low)
        Set<Integer> intSet = new HashSet<>();
        while (intSet.size() < N) {
            intSet.add(random.nextInt(high - low) + low);
        }
        return intSet.stream().mapToInt(Integer::intValue).toArray();
    } else {
        int[] arr = new int[N];
        for (int i = 0; i < N; ++i) {
            arr[i] = random.nextInt(high - low) + low;
        }
        return arr;
    }
}

Test output:

Sorted by frequency: true
// ... another 98 identical lines
Sorted by frequency: true
Collection time cost: LongSummaryStatistics{count=100, sum=273531781, min=466684, average=2735317.810000, max=131741520}
Custom time cost: LongSummaryStatistics{count=100, sum=366417748, min=1733417, average=3664177.480000, max=27617114}

Regarding java - Sorting array elements by frequency in Java, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51925232/
